WilliaBlog.Net

I dream in code

About the author

Robert Williams is an internet application developer for the Salem Web Network.
Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.


Random Sample Data Generator

I recently found myself testing a web service that required simple registration data: name, address, city, state, zip/postal code, phone, email, and so on. Real-world testing means navigating to the appropriate page on our site, selecting a product, and completing the registration form. I don't know about you, but I'm pretty lazy when it comes to testing like this. I soon find myself entering the same test data every time, and before long I end up typing gibberish like "ASDF" and "QWERTY" to speed things along. Anyone who has tested this way will understand the limitations of the approach and how it lets major bugs slip through the net. So, naturally, I was soon writing unit tests to do the heavy lifting for me. This is an MVC site, so writing the unit tests was a breeze, but the sample data remained a problem. How could I produce data that was different every time, yet wasn't gibberish and looked like real-world data?

Well, one approach would be to pull real data from our database and recycle it. Such practices, however, can violate a number of data privacy regulations. For example, the 1996 Health Insurance Portability and Accountability Act (HIPAA) mandates that companies restrict access to people’s personal health data on a “need to know” basis. Likewise, the Sarbanes-Oxley Act of 2002 requires companies to control access and track changes to systems handling corporate financial information. In addition, over 30 states have passed data breach notification laws requiring companies to notify consumers if their personal information may have been compromised. This includes such things as a person’s name and address, date of birth, social security number, and credit card and bank account numbers (Mathew Schwartz - The Dangers of Testing with Real Data).

So the problem, then, is how to get your hands on quality sample data that won't violate privacy regulations. After looking at an Excel Add-in with similar intent, I chose to write a Random Sample Data Generator in C#. The first thing I did was create an XML file full of real first names taken from a list found here. Then I added a list of real last names found on the US Census site. Next I found a list of fake company names and added that. Our database already had a list of countries and states, so I pulled those in too. I created a list of popular street names (Elm, Main, First, Second, Walnut, etc.) and street name suffixes (Road, Street, Blvd., etc.). I found a list of common city names somewhere and added that too, as well as name prefixes (Mr., Mrs., Dr., etc.) and suffixes (III, Jr., Sr., etc.). Finally, in order to create credible email addresses, I took the fake company name list and turned the names into possible domain names.

Numeric data such as zip codes and phone numbers is easy to generate, so there is no need to list it unless you must have real zip codes and real area codes. Currently, zip codes and phone numbers are just random digits, but if you wanted real area codes and real zip codes you could add them to the XML file and write new methods to pick from them. Obviously, slightly more complex code would be required to pull a matching state, zip code, and area code.
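As a rough sketch of that last idea, one option is to pick a state first and then draw the zip code and area code from values tied to that state. Everything named here is an assumption made for illustration: the `Region` type, the `MatchedLocation` class, and the two hand-picked sample entries are not part of the download, and the zip ranges are coarse prefix ranges, so not every number they produce will be a deliverable zip code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: one state plus a zip range and area codes that belong to it.
public sealed class Region
{
    public string StateCode;
    public int ZipMin;          // inclusive
    public int ZipMax;          // inclusive
    public string[] AreaCodes;
}

public static class MatchedLocation
{
    // Illustrative sample only; a fuller list would live in the XML file.
    private static readonly List<Region> Regions = new List<Region>
    {
        new Region { StateCode = "TX", ZipMin = 75000, ZipMax = 79999, AreaCodes = new[] { "214", "512", "713" } },
        new Region { StateCode = "VA", ZipMin = 22000, ZipMax = 24699, AreaCodes = new[] { "703", "757", "804" } }
    };

    // Picks a region first, then derives the zip and area code from that same region.
    // Returns { stateCode, zip, areaCode }.
    public static string[] Next(Random rng)
    {
        Region region = Regions[rng.Next(Regions.Count)];
        string zip = rng.Next(region.ZipMin, region.ZipMax + 1).ToString("D5");
        string areaCode = region.AreaCodes[rng.Next(region.AreaCodes.Length)];
        return new[] { region.StateCode, zip, areaCode };
    }
}
```

Calling `MatchedLocation.Next(new Random())` returns three strings that agree with each other, which is the property the purely random zip and phone generators cannot offer.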

All that remained was to write some C# code to pull random entries from the lists now stored in my XML file. Here it is:

namespace DataGenerator
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Reflection;
    using System.Text;
    using System.Xml;
    using System.Xml.Linq;

    using Extensions;

    public static class Generate
    {
        // A single shared Random. Creating a new Random() per call can yield
        // repeated values on the .NET Framework, where the default seed is time-based.
        private static readonly Random Rng = new Random();
        private static readonly object RngLock = new object();

        private static readonly XDocument dataDocument;

        static Generate()
        {
            using (Stream datafileStream = Assembly.GetExecutingAssembly().GetManifestResourceStream("DataGenerator.Data.xml"))
            {
                // Fail early if the embedded resource is missing, rather than
                // hitting a NullReferenceException on the first lookup.
                if (datafileStream == null)
                {
                    throw new InvalidOperationException("Embedded resource 'DataGenerator.Data.xml' was not found.");
                }

                using (XmlReader xmlReader = XmlReader.Create(datafileStream))
                {
                    dataDocument = XDocument.Load(xmlReader);
                }
            }
        }

        // Picks one element at random from the sequence and returns its text.
        private static string RandomValue(IEnumerable<XElement> elements)
        {
            lock (RngLock)
            {
                return elements.ToList().RandomItem(Rng).Value;
            }
        }

        /// <summary>
        /// Returns a random integer that is at least min and less than max.
        /// </summary>
        /// <param name="min">Inclusive lower bound.</param>
        /// <param name="max">Exclusive upper bound.</param>
        /// <example>int myInt = Generate.RandomInt(5, 1000); // gives a random integer from 5 to 999</example>
        public static int RandomInt(int min, int max)
        {
            lock (RngLock)
            {
                return Rng.Next(min, max);
            }
        }

        /// <summary>
        /// Returns a string representing a common name prefix, e.g. Mr., Mrs., Dr., etc.
        /// </summary>
        public static string RandomNamePrefix()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("prefixes").Descendants("prefix"));
        }

        /// <summary>
        /// Returns a string representing a person's first name, randomly selected from actual first names.
        /// </summary>
        public static string RandomFirstName()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("names").Descendants("first"));
        }

        /// <summary>
        /// Returns a string representing a person's last name, randomly selected from actual last names.
        /// </summary>
        public static string RandomLastName()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("names").Descendants("last"));
        }

        /// <summary>
        /// Returns a string representing a person's full name, randomly selected from actual first and last names.
        /// </summary>
        public static string RandomFullName()
        {
            return string.Format("{0} {1}", RandomFirstName(), RandomLastName());
        }

        /// <summary>
        /// Returns a string representing a common name suffix, e.g. III, Jr., M.D., etc.
        /// </summary>
        public static string RandomNameSuffix()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("suffixes").Descendants("suffix"));
        }

        /// <summary>
        /// Returns a random company name based on a list of fake company names.
        /// </summary>
        public static string RandomCompanyName()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("companies").Descendants("company").Descendants("name"));
        }

        /// <summary>
        /// Returns a string representing a street address, e.g. 123 First Ave.
        /// </summary>
        public static string RandomStreetAddress()
        {
            string street = RandomValue(dataDocument.Descendants("data").Descendants("addresses").Descendants("streetNames").Descendants("streetName"));
            string streetSuffix = RandomValue(dataDocument.Descendants("data").Descendants("addresses").Descendants("streetSuffixes").Descendants("streetSuffix"));
            return string.Format("{0} {1} {2}", RandomInt(1, 1999), street, streetSuffix);
        }

        /// <summary>
        /// Returns a random name that could be a city.
        /// </summary>
        public static string RandomCity()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("cities").Descendants("city").Descendants("name"));
        }

        /// <summary>
        /// Returns a real US/Canadian state name at random, e.g. Texas.
        /// </summary>
        public static string RandomStateName()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("states").Descendants("state").Descendants("name"));
        }

        /// <summary>
        /// Returns a real US/Canadian state code at random, e.g. TX.
        /// </summary>
        public static string RandomStateCode()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("states").Descendants("state").Descendants("code"));
        }

        /// <summary>
        /// Returns 5 random digits between 11111 and 99999 to use for a zip code.
        /// </summary>
        /// <remarks>Unlikely to produce many real zip codes that the post office would recognize.</remarks>
        public static string RandomZipCode()
        {
            // The upper bound of RandomInt is exclusive, so 100000 allows 99999.
            return RandomInt(11111, 100000).ToString();
        }

        /// <summary>
        /// Returns a real country name at random.
        /// </summary>
        public static string RandomCountry()
        {
            return RandomValue(dataDocument.Descendants("data").Descendants("countries").Descendants("country").Descendants("name"));
        }

        /// <summary>
        /// Returns a real-looking email address.
        /// </summary>
        public static string RandomEmailAddress()
        {
            string domain = RandomValue(dataDocument.Descendants("data").Descendants("domainNames").Descendants("domainName"));
            string domainSuffix = RandomValue(dataDocument.Descendants("data").Descendants("domainNameSuffixes").Descendants("suffix"));
            return string.Format("{0}.{1}@{2}.{3}", RandomFirstName(), RandomLastName(), domain, domainSuffix);
        }

        /// <summary>
        /// Returns a 10 digit phone number in the format (###) ###-####.
        /// </summary>
        /// <remarks>Area codes are unlikely to be real.</remarks>
        public static string RandomPhone()
        {
            StringBuilder phone = new StringBuilder();

            // Append random 1 to 3 digit chunks until there are at least 10 digits.
            while (phone.Length < 10)
            {
                int next = RandomInt(1, 999);
                phone.Append(next.ToString());
            }

            // Keep the first 10 digits and apply the phone number mask.
            return String.Format("{0:(###) ###-####}", Convert.ToInt64(phone.ToString().Substring(0, 10)));
        }
    }
}
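The element names in the queries above suggest how Data.xml is laid out. The sketch below is a reconstruction from those `Descendants(...)` calls only, not the actual file from the download; the sample values and the `LayoutSketch` class are assumptions, and most of the lists are left out to keep it short.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LayoutSketch
{
    // Hypothetical, heavily trimmed stand-in for the embedded Data.xml.
    public const string SampleXml = @"
<data>
  <prefixes><prefix>Mr.</prefix><prefix>Dr.</prefix></prefixes>
  <names>
    <first>Alice</first>
    <last>Smith</last>
  </names>
  <addresses>
    <streetNames><streetName>Elm</streetName></streetNames>
    <streetSuffixes><streetSuffix>Road</streetSuffix></streetSuffixes>
  </addresses>
  <states>
    <state><name>Texas</name><code>TX</code></state>
  </states>
</data>";

    // Mirrors the query shape used by RandomStateCode, minus the random pick.
    public static string[] StateCodes()
    {
        XDocument doc = XDocument.Parse(SampleXml);
        return doc.Descendants("data").Descendants("states").Descendants("state")
                  .Descendants("code").Select(e => e.Value).ToArray();
    }
}
```

Because each query is scoped through its parent element (`states`, `cities`, `countries`), elements that share a name such as `name` or `suffix` in different sections do not get mixed up.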

I chose to embed the XML file as a resource within the DLL, simply because that way we never need to worry about path issues. I used an extension method I found here to randomly select an XML node:

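namespace DataGenerator.Extensions
{
    using System;
    using System.Collections.Generic;

    public static class IEnumerableExtensions
    {
        // Returns one item chosen at random from the list, using the supplied generator.
        public static T RandomItem<T>(this List<T> list, Random rg)
        {
            if (list == null)
            {
                throw new ArgumentNullException("list");
            }

            if (rg == null)
            {
                throw new ArgumentNullException("rg");
            }

            // An empty list has nothing to pick; say so instead of failing on the indexer.
            if (list.Count == 0)
            {
                throw new ArgumentException("Cannot pick an item from an empty list.", "list");
            }

            int index = rg.Next(list.Count);
            return list[index];
        }
    }
}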
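One caveat for callers of `RandomItem`: on the .NET Framework, `new Random()` seeds itself from the system clock, so instances created in quick succession can start from the same seed and return the same values. A minimal sketch of one way around that, assuming a single shared generator behind a lock, follows. The `SharedRandom` class and its member names are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class SharedRandom
{
    private static readonly Random Instance = new Random();
    private static readonly object Gate = new object();

    // Random is not thread-safe, so every call goes through the lock.
    public static int Next(int maxExclusive)
    {
        lock (Gate)
        {
            return Instance.Next(maxExclusive);
        }
    }

    // Same idea as RandomItem, but the caller no longer supplies a Random.
    public static T Pick<T>(IList<T> items)
    {
        if (items == null) throw new ArgumentNullException("items");
        if (items.Count == 0) throw new ArgumentException("List is empty.", "items");
        return items[Next(items.Count)];
    }

    // Two generators built from the same seed yield the same sequence,
    // which is the effect behind the repeated values described above.
    public static bool SameSeedSameSequence(int seed)
    {
        Random a = new Random(seed);
        Random b = new Random(seed);
        for (int i = 0; i < 5; i++)
        {
            if (a.Next() != b.Next()) return false;
        }
        return true;
    }
}
```

`SameSeedSameSequence` is only there to show the effect; in real use, `Pick` would replace calls that pass a fresh `Random` each time.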

Using the class couldn't be easier:

Console.WriteLine(DataGenerator.Generate.RandomFullName());
Console.WriteLine(DataGenerator.Generate.RandomStreetAddress());
Console.WriteLine(DataGenerator.Generate.RandomCity());
Console.WriteLine(DataGenerator.Generate.RandomStateCode());
Console.WriteLine(DataGenerator.Generate.RandomZipCode());
Console.WriteLine(DataGenerator.Generate.RandomCountry());
Console.WriteLine(DataGenerator.Generate.RandomPhone());
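Data that changes on every run has one drawback in unit tests: a failure can be hard to reproduce. A possible refinement, which is not part of the original project, is to build the generator from an explicit seed and record that seed with the test output. The `SeededNames` class and its tiny inline lists below are assumptions made so the sketch runs on its own.

```csharp
using System;

public sealed class SeededNames
{
    // Tiny inline lists so the sketch is self-contained; the real lists live in Data.xml.
    private static readonly string[] First = { "Alice", "Bob", "Carol" };
    private static readonly string[] Last = { "Smith", "Jones", "Brown" };

    private readonly Random rng;

    // The seed is exposed so a test can log it and replay the same data later.
    public int Seed { get; private set; }

    public SeededNames(int seed)
    {
        Seed = seed;
        rng = new Random(seed);
    }

    public string NextFullName()
    {
        return string.Format("{0} {1}", First[rng.Next(First.Length)], Last[rng.Next(Last.Length)]);
    }
}
```

A test could create `new SeededNames(Environment.TickCount)`, write `Seed` to its output, and rerun with that exact seed when a failure needs investigating.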

 

Download the full source project (Visual Studio 2010, C# 4.0):

DataGenerator.zip (219.40 kb)


Posted on Friday, July 30, 2010 5:40 AM