Data Warehousing with BigQuery: Storage Design, Query Optimization, and Administration (DWBQ-SDQA) – Outline
Detailed Course Outline
Module 01 BigQuery Architecture Fundamentals
- BigQuery Core Infrastructure
- BigQuery Storage
- BigQuery Query Processing
- BigQuery Data Shuffling
- Explain the benefits of columnar storage.
- Understand how BigQuery processes data.
- Explore the basics of BigQuery’s shuffling service to improve query efficiency.
Module 02 Storage and Schema Optimizations
- BigQuery Storage
- Partitioning and Clustering
- Nested and Repeated Fields
- ARRAY and STRUCT syntax
- Best Practices
- Compare the performance of different schemas (snowflake, denormalized, and nested and repeated fields).
- Partition and cluster data for better performance
- Improve schema design using nested and repeated fields.
- Describe additional best practices such as table and partition expiration
Module 03 Ingesting Data
- Data Ingestion Options
- Batch Ingestion
- Streaming Ingestion
- Legacy Streaming API
- BigQuery Storage Write API
- Query Materialization
- Query External Data Sources
- Data Transfer Service
- Ingest batch and streaming data.
- Query external data sources.
- Schedule data transfers.
- Understand how to use Storage Write API.
Module 04 Changing Data
- Managing Change in Data Warehouses
- Handling Slowly Changing Dimensions (SCD)
- DML statements
- DML Best Practices and Common Issues
- Write DML statements.
- Address common DML performance problems and bottlenecks.
- Identify slowly changing dimensions (SCD) in your data and make updates.
Module 05 Improving Read Performance
- BigQuery’s Cache
- Materialized Views
- BI Engine
- High Throughput Reads
- BigQuery Storage Read API
- Explore BigQuery’s cache.
- Create materialized views.
- Work with BI Engine to accelerate your SQL queries.
- Use the Storage Read API for fast access to BigQuery-managed storage.
- Explain the caveats of using external data sources.
Module 06 Optimizing and Troubleshooting Queries
- Simple Query Execution
- SELECTs and Aggregation
- JOINs and Skewed JOINs
- Filtering and Ordering
- Best Practices for Functions
- Interpret BigQuery execution details and the query plan.
- Optimize query performance by using suggested methods for SQL statements and clauses.
- Demonstrate best practices for functions in business use cases.
Module 07 Workload Management and Pricing
- BigQuery Slots
- Pricing Models and Estimates
- Slot Reservations
- Controlling Costs
- Define a BigQuery slot.
- Explain pricing models and pricing estimations (BigQuery UI, bq dry_run, jobs API).
- Understand slot reservations, commitments, and assignments.
- Identify best practices to control costs.
Module 08 Logging and Monitoring
- Cloud Monitoring
- BigQuery Admin Panel
- Cloud Audit Logs
- Query Path and Common Errors
- Use Cloud Monitoring to view BigQuery metrics.
- Explore the BigQuery admin panel.
- Use Cloud Audit logs.
- Work with INFORMATION_SCHEMA tables to get insights for your BigQuery entities.
Module 09 Security in BigQuery
- Secure Resources with IAM
- Authorized Views
- Secure Data with Classification
- Data Discovery and Governance
- Explore data discovery using Data Catalog.
- Discuss data governance using DLP API and Data Catalog.
- Create IAM policies (e.g., authorized views) to secure resources.
- Secure data with classifications (e.g., row-level policies).
- Understand how BigQuery uses encryption.
Module 10 Automating Workloads
- Scheduling Queries
- Stored Procedures
- Integration with Big Data Products
- Schedule queries.
- Use scripting and stored procedures to build custom transformations.
- Describe how to integrate BigQuery workloads with other Google Cloud big data products.
Module 11 Machine Learning in BigQuery
- Introduction to BigQuery ML
- How to Make Predictions with BigQuery ML
- How to Build and Deploy a Recommendation System with BigQuery ML
- How to Build and Deploy a Demand Forecasting Solution with BigQuery ML
- Time-Series Models with BigQuery ML
- BigQuery ML Explainability
- Describe some of the different applications of BigQuery ML.
- Build and deploy several categories of machine learning models with BigQuery ML.
- Use AutoML Tables to solve high-value business problems.