PySpark SQL Learning Platform Topics

Python Resource Topics

  • What is Python
  • When to use Python
  • Installing Python
  • Running Python
  • Python versions and version management
  • Python ecosystem and tools
  • Python vs other languages
  • Python for data engineering
  • Python syntax and indentation
  • Python comments
  • Print Hello World
  • Python variables
  • Python numbers basics
  • Python strings and escape sequences
  • Python print with sep and end
  • Python input and output
  • Arithmetic Operators
  • Comparison Operators
  • Logical Operators
  • Assignment & Compound Operators
  • String Operators & Concatenation
  • Operator Precedence & Expression Evaluation
  • Type Conversion & Casting
  • String Indexing & Slicing
  • String Case & Whitespace Methods
  • String Search & Replace Methods
  • f-Strings & String Formatting
  • String split() and join()
  • String Check Methods
  • Multiline Strings & Raw Strings
  • String Operations & Real Patterns
  • if Statements & Single Conditions
  • if-elif-else Statements
  • Nested if Statements
  • Ternary Operator & Inline Conditions
  • for Loops - Basics
  • Loop Control & Patterns
  • while Loops & Real ETL Scenarios
  • Nested Loops & Patterns
  • Define and Call Functions
  • Function Parameters & Arguments
  • Return Values & Multiple Returns
  • Default & Keyword Arguments
  • Variable Scope - Local & Global
  • Lambda Functions
  • *args and **kwargs Basics
  • List Creation & Indexing
  • List Slicing & Filtering Operations
  • List Modification
  • List Comprehension Basics
  • Tuples - Creation & Unpacking
  • Sets - Creation & Operations
  • Dictionary Creation & Access
  • Dictionary Modification & List of Dictionaries
  • Dictionary Iteration & Methods
  • Nested Collections
  • len, type, min, max, sum
  • sorted() and reversed()
  • enumerate() in Loops
  • zip() - Combining Iterables
  • range() Advanced Patterns
  • map() and filter() Basics
  • Mini ETL Workflow
  • Beginner Complete
  • Python classes and objects
  • Python inheritance
  • Python special methods
  • Python read and write files
  • Python CSV and JSON
  • Python with statement
  • Python try except
  • Python custom exceptions
  • Python lambda
  • Python map, filter, and reduce
  • Python decorators
  • Python requests HTTP
  • Python web scraping
  • Python pytest
  • Python asyncio
  • Python async await
  • Python threading
  • Python multiprocessing
  • Python profiling
  • Python decorators advanced
  • Python metaclasses
  • Python type hints
  • Python secure coding
  • Python cryptography
  • Python NumPy and Pandas
  • Python Flask and Django
  • Python Selenium
  • Python Data Engineering

Python Practice Topics

  • Python syntax and indentation
  • Python comments
  • Print Hello World
  • Python variables
  • Python numbers basics
  • Python strings and escape sequences
  • Python print with sep and end
  • Python input and output
  • Arithmetic Operators
  • Comparison Operators
  • Logical Operators
  • Assignment & Compound Operators
  • String Operators & Concatenation
  • Operator Precedence & Expression Evaluation
  • Type Conversion & Casting
  • String Indexing & Slicing
  • String Case & Whitespace Methods
  • String Search & Replace Methods
  • f-Strings & String Formatting
  • String split() and join()
  • String Check Methods
  • Multiline Strings & Raw Strings
  • String Operations & Real Patterns
  • if Statements & Single Conditions
  • if-elif-else Statements
  • Nested if Statements
  • Ternary Operator & Inline Conditions
  • for Loops - Basics
  • Loop Control & Patterns
  • while Loops & Real ETL Scenarios
  • Nested Loops & Patterns
  • Define and Call Functions
  • Function Parameters & Arguments
  • Return Values & Multiple Returns
  • Default & Keyword Arguments
  • Variable Scope - Local & Global
  • Lambda Functions
  • *args and **kwargs Basics
  • List Creation & Indexing
  • List Slicing & Filtering Operations
  • List Modification
  • List Comprehension Basics
  • Tuples - Creation & Unpacking
  • Sets - Creation & Operations
  • Dictionary Creation & Access
  • Dictionary Modification & List of Dictionaries
  • Dictionary Iteration & Methods
  • Nested Collections
  • len, type, min, max, sum
  • sorted() and reversed()
  • enumerate() in Loops
  • zip() - Combining Iterables
  • range() Advanced Patterns
  • map() and filter() Basics
  • Mini ETL Workflow
  • Beginner Complete
  • Python function definition
  • Python parameters and return values
  • Python lambda exercise
  • Python importing modules
  • Python list comprehension
  • Python generators
  • Python collections module
  • Python read file
  • Python CSV
  • Python JSON
  • Python try except
  • Python custom exception
  • Python class creation
  • Python inheritance
  • Python encapsulation
  • Python requests
  • Python JSON parse
  • Python BeautifulSoup
  • Python async await
  • Python aiohttp
  • Python magic methods
  • Python descriptors
  • Python cProfile
  • Python threading
  • Python finance tracker
  • Python web scraper
  • Python REST API
  • Python chat app

PySpark Resource Topics

  • What is PySpark?
  • When to use PySpark
  • Setting up PySpark
  • Spark architecture
  • DataFrames vs RDDs
  • Create DataFrame
  • Show and count
  • Print schema
  • Column names and types
  • First and last rows
  • Describe statistics
  • Check for nulls
  • DataFrame shape
  • Select columns
  • Column alias
  • Select with alias
  • Select with col()
  • Drop columns
  • Select distinct values
  • Computed column select
  • Filter rows
  • Filter with where
  • Filter multiple conditions
  • Filter with isin
  • Filter null values
  • Filter with string match
  • Filter with between
  • Negate a filter
  • Filter after groupBy
  • Limit rows
  • Filter with regex
  • Filter with OR
  • Filter computed column
  • Filter and count
  • Filter with like
  • Order by column
  • Order by multiple columns
  • Sort ascending
  • Sort descending
  • Sort with nulls
  • Sort by expression
  • Top N rows
  • Rank with orderBy
  • Add new column
  • Rename column
  • Add literal column
  • Replace column
  • Add multiple columns
  • Rename multiple columns
  • Reorder columns
  • Add conditional column
  • Count aggregations
  • Sum aggregations
  • Average aggregations
  • Min and Max aggregations
  • Multiple aggregations
  • Statistical aggregations
  • Collection aggregations
  • Advanced aggregation patterns
  • Drop duplicates
  • Handle nulls
  • Fill null values
  • Drop null rows
  • Replace values
  • Cast column type
  • Trim whitespace
  • Remove duplicates by key
  • Clean and Aggregate DataFrame
  • Beginner Complete
  • PySpark inner join
  • PySpark left join
  • PySpark cross join
  • PySpark SQL query
  • PySpark temp view
  • PySpark catalog
  • PySpark UDF
  • PySpark Pandas UDF
  • PySpark vs Pandas
  • PySpark Interview Questions
  • PySpark caching
  • PySpark broadcast join
  • PySpark partitioning
  • PySpark Structured Streaming
  • Delta Lake

PySpark Practice Topics

  • Create DataFrame
  • Show and count
  • Print schema
  • Column names and types
  • First and last rows
  • Describe statistics
  • Check for nulls
  • DataFrame shape
  • Select columns
  • Column alias
  • Select with alias
  • Select with col()
  • Drop columns
  • Select distinct values
  • Computed column select
  • Filter rows
  • Filter with where
  • Filter multiple conditions
  • Filter with isin
  • Filter null values
  • Filter with string match
  • Filter with between
  • Negate a filter
  • Filter after groupBy
  • Limit rows
  • Filter with regex
  • Filter with OR
  • Filter computed column
  • Filter and count
  • Filter with like
  • Order by column
  • Order by multiple columns
  • Sort ascending
  • Sort descending
  • Sort with nulls
  • Sort by expression
  • Top N rows
  • Rank with orderBy
  • Add new column
  • Rename column
  • Add literal column
  • Replace column
  • Add multiple columns
  • Rename multiple columns
  • Reorder columns
  • Add conditional column
  • Count aggregations
  • Sum aggregations
  • Average aggregations
  • Min and Max aggregations
  • Multiple aggregations
  • Statistical aggregations
  • Collection aggregations
  • Advanced aggregation patterns
  • Drop duplicates
  • Handle nulls
  • Fill null values
  • Drop null rows
  • Replace values
  • Cast column type
  • Trim whitespace
  • Remove duplicates by key
  • Clean and Aggregate DataFrame
  • Beginner Complete
  • Inner join
  • Left join
  • Right join
  • Anti join
  • Sum and max
  • Min and avg
  • Multiple agg functions
  • ROW_NUMBER
  • RANK
  • Dense rank
  • Lag and lead
  • Create temp view
  • SQL query
  • SQL aggregation
  • Simple UDF
  • UDF with condition
  • Conditional column
  • String functions
  • Chain transformations
  • Cache DataFrame
  • Broadcast join
  • Repartition and coalesce
  • Explain plan
  • Self join
  • Cross join
  • Multiple condition join
  • Cumulative sum
  • Percent rank
  • NTile buckets
  • Moving average
  • Pivot table
  • Collect list
  • GroupBy having filter
  • Type casting advanced
  • Coalesce nulls
  • Regexp replace
  • Deduplication strategy
  • Null handling strategy
  • Full pipeline

SQL Resource Topics

  • What is SQL
  • When to use SQL
  • Relational databases
  • Tables and schemas
  • Running SQL
  • SQL data types
  • SQL syntax rules
  • SQL vs NoSQL
  • Popular SQL databases
  • How SQL queries execute
  • SELECT basics
  • SELECT specific columns
  • Column aliases
  • DISTINCT
  • Expressions in SELECT
  • Table aliases
  • CASE WHEN in SELECT
  • WHERE clause
  • Comparison operators
  • WHERE with AND OR
  • IN and NOT IN
  • BETWEEN
  • LIKE and wildcards
  • IS NULL and IS NOT NULL
  • ORDER BY
  • ORDER BY multiple columns
  • LIMIT and OFFSET
  • TOP and FETCH FIRST
  • COUNT and COUNT DISTINCT
  • SUM and AVG
  • MIN and MAX
  • GROUP BY
  • HAVING
  • Multiple aggregates
  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
  • Self join
  • Multiple joins
  • CROSS JOIN
  • JOIN with WHERE
  • UNION
  • UNION ALL
  • INTERSECT
  • EXCEPT
  • INSERT INTO
  • INSERT multiple rows
  • UPDATE
  • UPDATE with WHERE
  • DELETE
  • DELETE with WHERE
  • TRUNCATE vs DELETE
  • CREATE TABLE
  • ALTER TABLE
  • DROP TABLE
  • Join and Aggregate Sales Data
  • Beginner Complete
  • Scalar subquery
  • Subquery in WHERE
  • Subquery in FROM
  • Correlated subquery
  • EXISTS and NOT EXISTS
  • CTE basics
  • Multiple CTEs
  • CTE vs subquery
  • Recursive CTE
  • ROW_NUMBER
  • RANK and DENSE_RANK
  • PARTITION BY
  • LEAD and LAG
  • Running totals with SUM OVER
  • Conditional aggregation
  • FILTER clause
  • ROLLUP
  • CUBE
  • GROUPING SETS
  • UPPER, LOWER, and TRIM
  • SUBSTRING and REPLACE
  • String length and position
  • Date and time basics
  • Date arithmetic
  • EXTRACT and date parts
  • BEGIN, COMMIT, and ROLLBACK
  • SAVEPOINT
  • INSERT INTO SELECT
  • UPDATE with JOIN
  • DELETE with subquery
  • UPSERT and ON CONFLICT
  • Indexes
  • EXPLAIN and query plans
  • Index types
  • Covering indexes
  • Query rewriting
  • Pivot and unpivot
  • Gap and island problems
  • Hierarchical queries
  • JSON functions
  • Full text search
  • First Normal Form
  • Second and Third Normal Form
  • Denormalization trade-offs
  • Primary and foreign keys
  • Constraints and defaults
  • GRANT and REVOKE
  • Roles and privileges
  • Row-level security
  • Views as security layer
  • Percentiles and NTILE
  • FIRST_VALUE and LAST_VALUE
  • Sessionization with window functions
  • Cohort analysis
  • Moving averages

SQL Practice Topics

  • SELECT basics
  • SELECT specific columns
  • Column aliases
  • DISTINCT
  • Expressions in SELECT
  • Table aliases
  • CASE WHEN in SELECT
  • WHERE clause
  • Comparison operators
  • WHERE with AND OR
  • IN and NOT IN
  • BETWEEN
  • LIKE and wildcards
  • IS NULL and IS NOT NULL
  • ORDER BY
  • ORDER BY multiple columns
  • LIMIT and OFFSET
  • TOP and FETCH FIRST
  • COUNT and COUNT DISTINCT
  • SUM and AVG
  • MIN and MAX
  • GROUP BY
  • HAVING
  • Multiple aggregates
  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
  • Self join
  • Multiple joins
  • CROSS JOIN
  • JOIN with WHERE
  • UNION
  • UNION ALL
  • INTERSECT
  • EXCEPT
  • INSERT INTO
  • INSERT multiple rows
  • UPDATE
  • UPDATE with WHERE
  • DELETE
  • DELETE with WHERE
  • TRUNCATE vs DELETE
  • CREATE TABLE
  • ALTER TABLE
  • DROP TABLE
  • Join and Aggregate Sales Data
  • Beginner Complete
  • Scalar subquery
  • Subquery in WHERE
  • Subquery in FROM
  • Correlated subquery
  • EXISTS and NOT EXISTS
  • CTE basics
  • Multiple CTEs
  • CTE vs subquery
  • Recursive CTE
  • ROW_NUMBER
  • RANK and DENSE_RANK
  • PARTITION BY
  • LEAD and LAG
  • Running totals with SUM OVER
  • Conditional aggregation
  • FILTER clause
  • ROLLUP
  • CUBE
  • GROUPING SETS
  • UPPER, LOWER, and TRIM
  • SUBSTRING and REPLACE
  • String length and position
  • Date and time basics
  • Date arithmetic
  • EXTRACT and date parts
  • BEGIN, COMMIT, and ROLLBACK
  • SAVEPOINT
  • INSERT INTO SELECT
  • UPDATE with JOIN
  • DELETE with subquery
  • UPSERT and ON CONFLICT
  • Indexes
  • EXPLAIN and query plans
  • Index types
  • Covering indexes
  • Query rewriting
  • Pivot and unpivot
  • Gap and island problems
  • Hierarchical queries
  • JSON functions
  • Full text search
  • First Normal Form
  • Second and Third Normal Form
  • Denormalization trade-offs
  • Primary and foreign keys
  • Constraints and defaults
  • GRANT and REVOKE
  • Roles and privileges
  • Row-level security
  • Views as security layer
  • Percentiles and NTILE
  • FIRST_VALUE and LAST_VALUE
  • Sessionization with window functions
  • Cohort analysis
  • Moving averages