Python Data Engineer Learning Repository

Welcome to the Python Data Engineer learning repository! This repo contains a structured, practical set of Jupyter notebooks for learning core Python concepts with a focus on data engineering. Each topic comes with hands-on examples and explanations, plus links to the code for easy reference.

Note: This summary is based on the top-level files; for a full list of all tutorials and scripts, check the GitHub repository contents.

📚 Topics Covered

1. Python Introduction

Overview: Introduction to Python, variables, data types, and basic operations.
Key Concepts:
- Printing and string manipulation
- Variable assignment and naming
- Numeric, string, and boolean data types
- Type conversion, built-in functions, and string methods
- List basics and common list operations
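
As a rough illustration of the concepts listed above, the sketch below combines type conversion, string methods, a boolean comparison, and a basic list operation. The values and variable names are hypothetical and are not taken from the notebook.

```python
# Hypothetical sample values, chosen only for illustration.
raw_count = "42"                    # numeric data often arrives as text
count = int(raw_count)              # type conversion: str -> int

name = "  ada lovelace "
clean_name = name.strip().title()   # string methods chain left to right

is_valid = count > 0                # a comparison produces a boolean

columns = ["id", "name"]
columns.append("email")             # lists are mutable and grow in place

print(f"{clean_name} has {count} rows; columns: {columns}")
```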

2. Python Conditions

Overview: Mastering conditional statements for decision making.
Key Concepts:
- if, elif, else statements
- Comparison and logical operators
- Nested conditions and practical examples
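
A minimal sketch of these ideas, assuming a made-up payment record with an amount and a currency. The function name and the business rules are hypothetical.

```python
def classify_record(amount, currency):
    """Label a payment record (hypothetical rules, for illustration only)."""
    # Logical and membership operators combined in one test.
    if amount is None or currency not in ("USD", "EUR"):
        return "invalid"
    elif amount < 0:
        return "refund"
    else:
        # Nested condition: only valid, non-negative amounts reach here.
        if amount >= 1000:
            return "large"
        return "standard"


print(classify_record(1500, "USD"))
print(classify_record(-20, "EUR"))
```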

3. Python Loops

Overview: Using loops to automate repetitive tasks.
Key Concepts:
- for and while loops
- Loop control (break, continue, pass)
- Looping through lists, strings, and dictionaries
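
The loop constructs above could look something like the following. The readings, the 999 sentinel, and the batch limit are assumptions made up for this sketch.

```python
# Hypothetical sensor readings; 999 is an assumed "stop" sentinel.
readings = [12, -1, 7, 0, 30, 999, 5]

total = 0
for value in readings:
    if value == 999:
        break       # leave the loop entirely at the sentinel
    if value < 0:
        continue    # skip invalid readings and move to the next one
    total += value

# Looping through a string and building a dictionary of counts.
counts = {}
for ch in "banana":
    counts[ch] = counts.get(ch, 0) + 1

# Looping through a dictionary's key/value pairs.
for letter, n in counts.items():
    print(letter, n)

# while loop: halve a batch size until it fits an assumed limit of 128.
batch = 1000
while batch > 128:
    batch //= 2
```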

4. Python Functions

Overview: Writing reusable blocks of code with functions.
Key Concepts:
- Defining and calling functions
- Parameters, return values, and scope
- Lambda functions and higher-order functions
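
A short sketch of these concepts, using hypothetical helper names (`normalize`, `apply_to_all`) that are not part of the repository:

```python
def normalize(values, scale=1.0):
    """Return each value divided by the maximum, multiplied by scale."""
    peak = max(values)   # local variable: visible only inside the function
    return [v / peak * scale for v in values]


def apply_to_all(func, items):
    """Higher-order function: takes another function as an argument."""
    return [func(item) for item in items]


scores = normalize([2, 4, 8])                        # default scale is used
doubled = apply_to_all(lambda x: x * 2, [1, 2, 3])   # lambda as an argument
by_length = sorted(["csv", "json", "py"], key=len)   # built-in higher-order use

print(scores, doubled, by_length)
```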

5. Python Operators

Overview: Using operators to manipulate data.
Key Concepts:
- Arithmetic, assignment, comparison, logical, bitwise, and membership operators
- Precedence and associativity
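
Precedence and associativity are easiest to see with small expressions. The following sketch uses arbitrary numbers and states each result in a comment.

```python
a = 2 + 3 * 4             # multiplication binds tighter: 14
b = (2 + 3) * 4           # parentheses override precedence: 20
c = 2 ** 3 ** 2           # ** is right-associative: 2 ** 9 = 512
d = 10 - 4 - 3            # - is left-associative: (10 - 4) - 3 = 3
e = -2 ** 2               # ** binds tighter than unary minus: -4
f = not True or True      # not binds tighter than or: True
flags = 0b0101 & 0b0110   # bitwise AND: 0b0100 = 4
has_id = "id" in ["id", "name"]   # membership operator: True

print(a, b, c, d, e, f, flags, has_id)
```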

6. Python Collections

Overview: Mastering data structures for efficient storage and retrieval.
Key Concepts:
- Lists, tuples, sets, dictionaries
- When and how to use each collection
- Real-world data engineering examples using collections
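
As one possible illustration of choosing a collection for the job, the sketch below uses tuples for fixed records, a set for uniqueness, and a dictionary for grouping. The event data is invented for this example.

```python
# Each row is a tuple: a fixed-size, immutable record (hypothetical data).
rows = [("u1", "click"), ("u2", "view"), ("u1", "view"), ("u1", "click")]

# Set: keeps only distinct values, useful for deduplication.
unique_users = {user for user, _ in rows}

# Dictionary of lists: group events by user for fast lookup by key.
events_by_user = {}
for user, event in rows:
    events_by_user.setdefault(user, []).append(event)

print(sorted(unique_users))
print(events_by_user)
```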

7. Python Modules & Packages

Overview: Organizing and reusing code with modules and packages.
Key Concepts:
- The difference between modules, packages, and libraries (with LEGO analogies)
- Importing and using built-in and external libraries (e.g., Pandas, NumPy, Matplotlib, Requests, Scikit-learn)
- Creating custom modules and packages
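
To make the idea of a custom module concrete, here is a minimal sketch that writes a tiny module to a temporary directory and imports it. The module name `cleaning_utils` and its function are hypothetical, and only the standard library is used so that the example runs without installing anything.

```python
import importlib
import math                      # built-in module, imported whole
import sys
import tempfile
from pathlib import Path         # import a single name from a module

# Create a hypothetical one-function module on disk.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "cleaning_utils.py").write_text(
    "def strip_all(values):\n"
    "    return [v.strip() for v in values]\n"
)

# Make the directory importable, then import the new module by name.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
cleaning_utils = importlib.import_module("cleaning_utils")

result = cleaning_utils.strip_all([" a ", "b "])
print(result, math.sqrt(16))
```

In a real project the module file would normally sit next to the notebook or script, and a plain `import cleaning_utils` would be enough.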

8. Randoms Directory

Overview: Working with randomness: generating random numbers and data for testing and simulations.
Key Concepts:
- Using Python’s random module for numbers, choices, and shuffling
- Generating random data for data engineering tasks
- Introduction to the faker library for synthetic data creation
- Practical examples: random sampling, data anonymization
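
The sketch below shows sampling, shuffling, and a naive form of pseudonymization with the standard-library `random` module. The seed, the ID range, and the `user_NNNN` token scheme are assumptions for illustration. The token scheme is not collision-safe and is not suitable for real anonymization. The Faker library is left out so the example stays dependency-free.

```python
import random

rng = random.Random(42)          # seeded generator for repeatable runs

user_ids = list(range(1, 11))
sample = rng.sample(user_ids, k=3)        # sampling without replacement

shuffled = user_ids[:]                    # copy, so the original order is kept
rng.shuffle(shuffled)                     # shuffle in place

environment = rng.choice(["dev", "staging", "prod"])
dice_roll = rng.randint(1, 6)             # both endpoints are inclusive

# Naive pseudonymization: the same input always maps to the same token.
names = ["alice", "bob", "alice"]
pseudonyms = {}
for name in names:
    if name not in pseudonyms:
        pseudonyms[name] = f"user_{rng.randrange(10_000):04d}"
masked = [pseudonyms[name] for name in names]

print(sample, shuffled, environment, dice_roll, masked)
```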

9. CSV Directory

Overview: CSV file handling and manipulation for data storage and retrieval.
Key Concepts:
- Reading and writing text and CSV files
- Working with CSV files using the Pandas library
- File and directory operations using os and shutil
- Handling file paths and exceptions
- Data extraction and ingestion from files
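
A minimal sketch of writing, reading, and safely opening CSV files. It uses the standard-library `csv` module rather than Pandas so that it runs without extra installs. The file name and order data are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Write to a temporary directory so the example leaves no files behind.
csv_path = Path(tempfile.mkdtemp()) / "orders.csv"
rows = [
    {"order_id": "1", "amount": "9.50"},
    {"order_id": "2", "amount": "20.00"},
]

# newline="" lets the csv module control line endings itself.
with csv_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "amount"])
    writer.writeheader()
    writer.writerows(rows)

# Every value comes back as a string, so convert before doing arithmetic.
with csv_path.open(newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))
total = sum(float(row["amount"]) for row in loaded)

# Handle a missing file explicitly instead of letting the script crash.
missing_handled = False
try:
    (csv_path.parent / "missing.csv").open()
except FileNotFoundError:
    missing_handled = True

print(loaded, total, missing_handled)
```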

10. JSON Directory

Overview: Managing JSON data formats for configuration and data exchange.
Key Concepts:
- Reading and writing JSON files with Python’s json module
- Parsing and serializing complex JSON structures
- Real-world use cases: configuration files, API responses
- Data transformation between JSON and Python objects
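
The round trip between Python objects and JSON text can be sketched as follows. The configuration keys are invented for this example.

```python
import json

# Hypothetical pipeline configuration with nested structures.
config = {
    "source": {"format": "csv", "paths": ["a.csv", "b.csv"]},
    "retries": 3,
    "dry_run": False,
    "owner": None,
}

# Serialize: dict -> JSON text (False becomes false, None becomes null).
text = json.dumps(config, indent=2, sort_keys=True)

# Parse: JSON text -> Python objects again.
restored = json.loads(text)

print(text)
print(restored["source"]["paths"][1])
```

For files, `json.dump` and `json.load` do the same conversions against an open file object instead of a string.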

11. Blocks Directory

Overview: Code blocks and reusable scripts for modular data engineering workflows.
Key Concepts:
- Encapsulating logic in code blocks (functions, scripts)
- Organizing reusable code for ETL pipelines
- Example templates for batch processing and automation
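
One way to organize reusable code for an ETL pipeline is to give each stage its own function. The template below is a sketch under that assumption. The function names, the sample records, and the in-memory list standing in for a target store are all hypothetical.

```python
def extract():
    """Return raw records (hard-coded here in place of a real source)."""
    return [{"name": " Ada ", "age": "36"}, {"name": "Bob", "age": ""}]


def transform(records):
    """Clean the records and drop rows with a missing age."""
    cleaned = []
    for record in records:
        if not record["age"]:
            continue
        cleaned.append({"name": record["name"].strip(), "age": int(record["age"])})
    return cleaned


def load(records, target):
    """Append records to the target and report how many were loaded."""
    target.extend(records)
    return len(records)


def run_pipeline(target):
    """Chain the three stages; each one can be tested on its own."""
    return load(transform(extract()), target)


warehouse = []   # stand-in for a database table or file
loaded_count = run_pipeline(warehouse)
print(loaded_count, warehouse)
```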

12. Logging Directory

Overview: Logging and monitoring data engineering processes.
Key Concepts:
- Using Python’s logging module for event tracking
- Setting up log formats, levels, and handlers
- Best practices for error handling and process monitoring
- Writing logs to files and integrating with external tools
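
A minimal sketch of a logger with a level, a format, and a file handler. The logger name `etl_demo`, the log file name, and the format string are assumptions chosen for this example.

```python
import logging
import tempfile
from pathlib import Path

log_path = Path(tempfile.mkdtemp()) / "pipeline.log"

logger = logging.getLogger("etl_demo")
logger.setLevel(logging.DEBUG)     # the logger itself accepts everything
logger.propagate = False           # keep the demo output out of the root logger

# The handler filters again: only WARNING and above reach the file.
file_handler = logging.FileHandler(log_path, encoding="utf-8")
file_handler.setLevel(logging.WARNING)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)
logger.addHandler(file_handler)

logger.info("starting")            # below the handler level, so not written

try:
    1 / 0
except ZeroDivisionError:
    # logger.exception logs at ERROR level and appends the traceback.
    logger.exception("step failed")

# Flush and detach the handler before reading the file back.
file_handler.close()
logger.removeHandler(file_handler)

log_text = log_path.read_text(encoding="utf-8")
print(log_text)
```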

📎 How to Use This Repo

Browse Notebooks: Start with the Jupyter notebooks in the main directory for a structured learning path.
Explore Directories: Check out the additional folders for more scripts and data.
Try the Code: Run the notebooks locally or in an online Jupyter environment.
Contribute: Pull requests to add new topics or improve examples are welcome!

Folders and files

10-json
11-blocks
12-logging
7-modules
8-random
9-csv
streamlit
1-Python-Introduction.ipynb
2-Python-Conditions.ipynb
3-Python-Loops.ipynb
4-Python-Functions.ipynb
5-Python-Operators.ipynb
6-Python-Collection.ipynb
7-Python-Modules & Packages.ipynb
README.md

gkdevops/python-data-engineer
