Faker: Generate Realistic Test Data in Python with One Line of Code

October 11, 2025

Khuyen Tran

Motivation
Basics of Faker
Location-Specific Data Generation
Create Text
Create Profile Data
Create Random Python Datatypes
Conclusion

Motivation

Suppose you need data of particular types (booleans, floats, text, integers) with particular characteristics (names, addresses, colors, emails, phone numbers, locations) to test a Python library or your own implementation. Finding a dataset that fits can take a long time, so you might wonder: is there a quick way to create your own data?

What if a package let you create fake data in one line of code, like this:

fake.profile()

{
    'address': '076 Steven Trace\nJillville, ND 12393',
    'birthdate': datetime.date(1981, 11, 19),
    'blood_group': 'O-',
    'company': 'Johnson-Rodriguez',
    'current_location': (Decimal('61.969848'), Decimal('121.407164')),
    'job': 'Patent examiner',
    'mail': 'ohicks@hotmail.com',
    'name': 'Katie Romero',
    'residence': '271 Smith Wells\nMichaelport, MN 40933',
    'sex': 'F',
    'ssn': '281-84-3963',
    'username': 'eparker',
    'website': ['https://www.gonzalez.com/', 'https://rogers-scott.com/']
}

This can be done with Faker, a Python package that generates fake data for you. You can control the data type, the characteristics of the data, and its locale or language. Let’s look at how to use Faker to create fake data.

💻 Get the Code: The complete source code and Jupyter notebook for this tutorial are available on GitHub. Clone it to follow along!

Basics of Faker

Start by installing the package:

pip install Faker

Import Faker:

from faker import Faker

fake = Faker()

Some basic methods of Faker:

print(fake.color_name())
print(fake.name())
print(fake.address())
print(fake.job())
print(fake.date_of_birth(minimum_age=30))
print(fake.city())

Tan
Kristin Buck
715 Peter Views
Abigailport, ME 57602
Systems analyst
1946-03-07
Evanmouth

Let’s say you are the author of a fiction book who wants to create a character but finds it difficult and time-consuming to come up with a realistic name and background. You can write:

name = fake.name()
color = fake.color_name()
city = fake.city()
job = fake.job()

print(f'Her name is {name}. She lives in {city}. Her favorite color is {color}. She works as a {job}')

Her name is Debra Armstrong. She lives in Beanview. Her favorite color is GreenYellow. She works as a Lawyer

With Faker, you can generate a convincing example instantly!

Location-Specific Data Generation

Luckily, we can also specify the locale of the data we want to fake. Maybe the character you want to create is from Italy, and you also want to create some of her friends. If you are from the US, it is hard to come up with names and places that fit that country. Passing a locale code to the Faker constructor takes care of this:

fake = Faker('it_IT')

for _ in range(10):
    print(fake.name())

Angelica Donarelli-Marangoni
Rosaria Castiglione
Federica Iacovelli
Puccio Armellini
Dina Donini-Alboni
Dott. Carolina Marrone
Olga Nosiglia
Graziella Russo
Paulina Galiazzo
Dott. Riccardo Padovano

Or create information from multiple locations:

fake = Faker(['ja_JP','zh_CN','es_ES','en_US','fr_FR'])

for _ in range(10):
    print(fake.city())

齐齐哈尔市
Blakefort
North Joeborough
玉兰市
Saint Suzanne-les-Bains
Melilla
調布市
富津市
Maillot-sur-Mer
East Jamesshire

If you are from one of these countries, I hope you recognize some of the locations. If you are curious about the other locales you can specify, check out the doc here.

Create Text

Create Random Text

We can create random text with:

fake = Faker('en_US')
print(fake.text())

Gas threat perhaps minute energy thus. Relate group science car discussion budget art.
Let visit reach senior. Story once list almost. Enough major everyone.

Try with the Vietnamese language:

fake = Faker('vi_VN')
print(fake.text())

Như không cho số vậy tại đến. Hơn các thay. Khi từ cũng không rất là.
Gần được cho có nơi như vẫn cho. Nơi đi về giống.
Mà cũng từ nhưng lớn. Từng của nếu khi như nhưng.

None of this random text makes sense, but it is a quick way to create text for testing.

Create Text from Selected Words

We can also create text from a list of words:

fake = Faker()
my_information = ['dog','swimming', '21', 'slow', 'girl', 'coffee', 'flower','pink']

print(fake.sentence(ext_word_list=my_information))
print(fake.sentence(ext_word_list=my_information))

Coffee pink coffee.
Dog pink 21 pink.

Create Profile Data

We can quickly create a profile with:

fake = Faker()
fake.profile()

{'job': 'Nurse, adult',
 'company': 'Johnson, Moore and Glover',
 'ssn': '762-56-8929',
 'residence': '742 Shane Groves\nLake Jasminefort, GU 12583',
 'current_location': (Decimal('-77.3842165'), Decimal('7.407430')),
 'blood_group': 'B-',
 'website': ['https://brooks.com/'],
 'username': 'brownamanda',
 'name': 'Carolyn Navarro',
 'sex': 'F',
 'address': '505 Lewis Grove Apt. 588\nHowardville, ID 68181',
 'mail': 'larry00@hotmail.com',
 'birthdate': datetime.date(1946, 6, 13)}

As we can see, most of the relevant information about a person is created with ease, including the mail, ssn, username, and website fields.

What is even more useful is that we can create a dataframe of 100 users from different countries:

import pandas as pd

fake = Faker(['it_IT','ja_JP', 'zh_CN', 'de_DE','en_US'])
profiles = [fake.profile() for i in range(100)]

pd.DataFrame(profiles).head()

	job	company	ssn	residence	current_location	blood_group	website	username	name	sex	address	mail	birthdate
0	Physiological scientist	Sobrero-Mazzanti Group	CLGTNO59H42A473Z	Incrocio Cabrini, 14 Appartamento 59\n74100, L…	(-88.2637715, 149.968584)	AB+	[http://federici-endrizzi.it/, http://www.paru…]	giuliagreco	Dott. Liliana Serraglio	F	Vicolo Milo, 0\n64020, Ripattoni (TE)	giolittiflavio@gmail.com	1998-10-10
1	花火師	阿部運輸株式会社	701-41-9799	和歌山県印旛郡本埜村鳥越20丁目23番18号	(79.245074, 109.117174)	O+	[https://suzuki.com/, http://ishikawa.jp/]	lyamamoto	斉藤明美	F	東京都江戸川区神明内40丁目12番20号	akemiyamada@yahoo.com	1916-12-09
2	小説家	小林食品株式会社	103-28-5057	島根県富津市細野7丁目16番1号	(-84.3304275, 38.093874)	A+	[https://tanaka.jp/, http://www.fujita.net/, h…]	minoru62	渡辺英樹	M	青森県川崎市川崎区長畑22丁目27番12号	minoru35@yahoo.com	2008-02-17
3	ゲームクリエイター	佐藤水産有限会社	123-85-7967	宮城県調布市隼町3丁目22番12号アーバン台東327	(-49.3689775, -134.762867)	AB-	[http://www.sato.org/, http://kato.net/, http:…]	ayamamoto	鈴木洋介	M	栃木県川崎市中原区虎ノ門30丁目27番20号	yuta56@hotmail.com	1917-01-25
4	薬剤師	合同会社高橋建設	891-98-2169	山梨県山武郡横芝光町轟4丁目22番10号コート天神島159	(-62.1493985, -105.171377)	B+	[http://yamashita.jp/, http://www.shimizu.com/]	yosukekimura	田中真綾	F	山口県府中市下吉羽6丁目20番2号	hayashiyuki@yahoo.com	2001-08-09

Create Random Python Datatypes

If we only care about the type of the data, not the information it holds, we can easily generate random values of Python datatypes such as:

Boolean:

print(fake.pybool())

False

A list of about 5 elements of mixed data types (with variable_nb_elements=True, the actual length varies around nb_elements):

print(fake.pylist(nb_elements=5, variable_nb_elements=True))

['juan28@example.org', 8515, 6618, 'UexWQJkGrJFGBAVfHgUt']

A decimal with 5 digits before the decimal point and 6 digits after it:

print(fake.pydecimal(left_digits=5, right_digits=6, positive=False, min_value=None, max_value=None))

-26114.564612

You can find more about other Python datatypes that you can create here.

Conclusion

I hope you find Faker a helpful tool for creating data efficiently. Even if you don’t need it for what you are working on right now, it is good to know that a tool exists for generating data tailored to specific needs such as testing.

Feel free to check out more information about Faker here.
