Data Analysis Examples
Use zo for quick data analysis, processing, and insights from the command line.
CSV Analysis
Basic Statistics
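Model-computed totals are worth cross-checking against a deterministic calculation. The sketch below is one way to do that with awk for the figures requested in the following prompt. The helper name `csv_totals` and the column layout (a header row, then `order_id,category,amount`, with no quoted commas) are assumptions for illustration, not part of zo or of any particular sales.csv.

```shell
# Hypothetical layout: header row, then order_id,category,amount (no quoted commas)
csv_totals() {
  # Skip the header, then accumulate the amount column and the row count
  awk -F, 'NR > 1 { total += $3; n++ }
           END {
             if (n == 0) { print "transactions=0"; exit }
             printf "transactions=%d total=%.2f average=%.2f\n", n, total, total / n
           }' "$@"
}
```

Running `csv_totals sales.csv` prints the transaction count, total, and average on one line, which can be compared with the model's answer.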
cat sales.csv | zo 'Calculate summary statistics:
- Total revenue
- Average order value
- Number of transactions
- Revenue by category'
Find Patterns
cat customer_data.csv | zo 'Analyze this customer data:
- Find patterns in purchase behavior
- Identify high-value customers
- Suggest segmentation strategy'
Data Quality Check
cat data.csv | zo 'Check data quality:
- Missing values per column
- Outliers
- Duplicate records
- Data type issues'
Compare Datasets
zo @dataset1.csv @dataset2.csv 'Compare these datasets:
- Schema differences
- Value distribution changes
- New/removed records'
JSON Processing
Extract Data
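The fields requested below can also be pulled out deterministically. This is a minimal sketch, assuming the trace response is plain `key=value` text with one pair per line and keys named `ip`, `loc`, and `tls`. The helper name `trace_fields` is hypothetical.

```shell
# Keep only the ip, loc, and tls lines from key=value input (assumed format)
trace_fields() {
  awk -F= '$1 == "ip" || $1 == "loc" || $1 == "tls"' "$@"
}
```

With that in place, `curl -s https://1.0.0.1/cdn-cgi/trace | trace_fields` would print only those three lines.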
curl -s https://1.0.0.1/cdn-cgi/trace | zo 'Extract:
- IP
- Location
- TLS Version'
Transform Format
cat data.json | zo 'Convert to CSV with columns: id, name, email, created_at'
Query Data
cat large_api_response.json | zo 'Find all users with:
- More than 100 followers
- Created in 2024
- Located in California'
Generate jq Command
cat sample.json | zo 'Generate jq command to extract all email addresses from nested user objects'
Log Analysis
Error Summary
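Pre-aggregating a log before sending it to the model keeps the input small and gives exact counts to compare against the summary. This is a rough sketch, assuming a hypothetical log format of `<date> <time> <LEVEL> <component>: <message>`. The function name `summarize_errors` is illustrative only, and real logs will likely need a different field layout.

```shell
# Hypothetical log format: "<date> <time> <LEVEL> <component>: <message>"
summarize_errors() {
  # Count ERROR lines per component, then list the most frequent first
  awk '$3 == "ERROR" { sub(/:$/, "", $4); count[$4]++ }
       END { for (c in count) print count[c], c }' "$@" | sort -k1,1nr -k2,2
}
```

The output of `tail -1000 app.log | summarize_errors` is one `count component` pair per line and can be piped into zo in place of the raw log.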
tail -1000 app.log | zo 'Summarize errors:
- Most common error types
- Error frequency
- Affected components
- Suggested fixes'
Pattern Detection
grep ERROR /var/log/nginx/error.log | zo 'Find patterns:
- Time of day with most errors
- Common error causes
- Suspicious IP addresses
- Recommended actions'
Performance Analysis
cat slow_queries.log | zo 'Analyze slow queries:
- Slowest queries
- Common patterns in slow queries
- Optimization suggestions
- Index recommendations'
Security Analysis
cat access.log | zo 'Security analysis:
- Suspicious request patterns
- Potential attack attempts
- Unusual user agents
- Recommended security measures'
SQL Generation
Simple Query
zo @schema.sql 'Generate query to:
- Find top 10 customers by total purchases
- Include customer name, email, purchase count, and total spent
- Order by total spent descending'
Complex Query
zo @schema.sql 'Generate query with:
- Join customers, orders, and products
- Filter for 2024 purchases
- Group by product category
- Calculate revenue per category
- Include percentage of total'
Schema Design
zo 'Generate SQL schema for:
- E-commerce platform
- Users, products, orders, reviews
- Include foreign keys, indexes
- Add constraints and defaults'
Data Transformation
Format Conversion
# CSV to JSON
cat data.csv | zo 'Convert to JSON array'
# JSON to CSV
cat data.json | zo 'Convert to CSV'
# XML to JSON
cat data.xml | zo 'Convert to JSON'
Data Cleaning
cat messy_data.csv | zo 'Clean this data:
- Remove duplicates
- Fix date formats to YYYY-MM-DD
- Standardize phone numbers
- Fill missing values with median'
System Diagnostics
Memory Analysis
ps aux --sort=-%mem | head -20 | zo 'Memory analysis:
- Highest memory consumers
- Potential memory leaks
- Processes to investigate
- Recommended actions'
Disk Usage
du -sh * | sort -hr | head -20 | zo 'Disk usage analysis:
- Largest directories
- What can be cleaned
- Suggested optimizations'
Network Analysis
netstat -tuan | zo 'Network analysis:
- Active connections
- Listening ports
- Security concerns
- Recommendations'
Process Analysis
ps aux | zo 'Analyze running processes:
- Resource-intensive processes
- Zombie processes
- Unusual processes
- Recommendations'
API Data Analysis
Rate Limit Analysis
cat api_logs.txt | zo 'Analyze API usage:
- Requests per endpoint
- Peak usage times
- Users hitting rate limits
- Capacity recommendations'
Response Time Analysis
cat api_metrics.json | zo 'Analyze response times:
- Slowest endpoints
- P95, P99 latencies
- Performance trends
- Optimization targets'
Error Rate Analysis
cat api_errors.log | zo 'Error rate analysis:
- Error frequency by endpoint
- Most common error types
- Client vs server errors
- Suggested fixes'
Data Visualization Suggestions
Chart Recommendations
cat sales_data.csv | zo 'Recommend visualizations:
- Best chart types for this data
- Key metrics to highlight
- Interactive features to include
- Color scheme suggestions'
Python Plotting Code
cat data.csv | zo /coder 'Generate Python code to:
- Load this CSV
- Create 3 meaningful visualizations
- Use matplotlib or seaborn
- Include titles and labels'
Dashboard Design
cat metrics.json | zo 'Design dashboard:
- Key metrics to display
- Layout suggestions
- Chart types
- Real-time vs historical views'
Statistical Analysis
Descriptive Statistics
cat survey_results.csv | zo 'Calculate:
- Mean, median, mode for each numeric column
- Standard deviation
- Quartiles
- Correlation matrix'
Hypothesis Testing
cat experiment_data.csv | zo 'Statistical analysis:
- Compare control vs treatment groups
- Calculate p-value
- Determine statistical significance
- Interpret results'
Regression Analysis
cat sales_data.csv | zo 'Regression analysis:
- Predict sales based on marketing spend
- Calculate R-squared
- Identify significant variables
- Interpret coefficients'
Data Pipeline Examples
ETL Pipeline
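#!/bin/bash
# Stop on the first failed step so a bad download is never transformed or loaded
set -euo pipefail
# Extract (-f makes curl fail on HTTP errors instead of saving the error page)
curl -fsS https://api.example.com/data > raw_data.json
# Transform
cat raw_data.json | zo 'Transform to CSV with columns: id, name, amount, date' > clean_data.csv
# Load (generate SQL)
zo @schema.sql @clean_data.csv 'Generate INSERT statements'
Data Quality Pipeline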
#!/bin/bash
# Check quality
issues=$(cat data.csv | zo 'List data quality issues as JSON')
# Fix issues
echo "$issues" | zo 'Generate Python script to fix these data quality issues'
Advanced Analysis
Time Series Analysis
cat time_series.csv | zo 'Time series analysis:
- Identify trends
- Detect seasonality
- Find anomalies
- Forecast next 30 days
- Confidence intervals'
Cohort Analysis
cat user_cohorts.csv | zo 'Cohort analysis:
- Retention rates by cohort
- Cohort size trends
- Revenue per cohort
- Churn patterns
- Recommendations'
Shell Functions for Data Analysis
Add to ~/.bashrc or ~/.zshrc:
# Quick CSV stats
csvstats() {
  cat "$1" | zo 'Summary statistics for all numeric columns'
}
# Find correlations
csvcorr() {
  cat "$1" | zo 'Correlation matrix for numeric columns. Highlight strong correlations.'
}
# Data quality check
datacheck() {
  cat "$1" | zo 'Data quality report:
- Missing values
- Outliers
- Duplicates
- Type mismatches'
}
# Generate SQL
sqlgen() {
  zo @schema.sql "$*"
}
# API analysis
apistat() {
  cat "$1" | zo 'API usage statistics:
- Request distribution
- Error rates
- Response times
- Top consumers'
}
Tips for Data Analysis
Provide Context
# ❌ Too vague
cat data.csv | zo 'Analyze this'
# ✅ Specific
cat sales_data.csv | zo 'Analyze 2024 sales data:
- Monthly revenue trends
- Top performing products
- Regional performance
- YoY growth rate'
Sample Large Datasets
# First 1000 rows for quick analysis
head -1000 large_file.csv | zo 'Quick analysis'
# Random sample (keep the header row, shuffle only the data rows)
{ head -1 large_file.csv; tail -n +2 large_file.csv | shuf -n 1000; } | zo 'Sample analysis'
Iterate with Chat Mode
zo --chat @data.csv 'Let us analyze this sales data'
> What are the top 10 products?
> Show me revenue trends
> Any seasonal patterns?
> exit
Use Appropriate Models
# Quick analysis - fast model
cat data.csv | zo /flash 'Quick summary'
# Deep analysis - smart model
cat data.csv | zo /opus 'Comprehensive statistical analysis'