Venkatesh Gopinath Bogem

Data Engineer | Machine Learning Enthusiast | Cloud-native Solutions Architect

I’m a Data Engineer and Machine Learning practitioner focused on turning raw data into actionable insights. With experience across cloud platforms, scalable data pipelines, and model deployment, I build data-driven solutions to complex problems. My work combines engineering precision with a research-driven mindset, bridging data systems and AI in production environments.

📬 Contact

🛠 Technical Stack

💻 Languages: Python, R, Java, SQL, Scala, Bash, JavaScript
☁️ Cloud: AWS, GCP, Azure
📦 Big Data: Apache Spark, Hadoop, Kafka, Flink, Hive, Airflow, PySpark
🗄 Databases: MySQL, PostgreSQL, MongoDB, Snowflake, Cassandra, DynamoDB
📊 Visualization: Tableau, Power BI, Looker, QuickSight, Matplotlib, D3.js
🤖 ML & NLP: TensorFlow, PyTorch, Keras, scikit-learn, NLTK, spaCy
⚙️ DevOps: Docker, Kubernetes, Jenkins, Terraform, GitHub Actions
🧪 Tools: MLflow, Flask, FastAPI, Django, Pandas, Jupyter

💼 Experience

Business Intelligence Engineer | Amazon (Jan 2025 – Present)

  • Designed and executed ETL pipelines using Datanet and advanced SQL to automate KPI reporting.
  • Built multiple Amazon QuickSight dashboards for end-to-end business performance tracking.

Data Engineer | Abecedarian (May 2024 – Dec 2024)

  • Engineered scalable GenAI-enabled pipelines for automated ingestion, transformation, and synthetic data generation.
  • Deployed custom data models using AWS/GCP infrastructure and ensured compliance and security at scale.

Data Engineer / Analyst | Datics Inc. (May 2023 – Dec 2023)

  • Optimized ETL pipelines using AWS Glue, S3, Lambda, and Airflow, cutting processing time by 50%.
  • Built dashboards (Power BI, QuickSight) to support real-time decision-making across teams.

Data Engineer I | ACCK Solutions (Jun 2019 – Dec 2021)

  • Processed over 1M records/day using PySpark and optimized data pipelines across Ads, CRM, and Analytics tools.
  • Built dimensional models (Star/Snowflake schema) and interactive dashboards in Tableau and Power BI.

Machine Learning Assistant | Northeastern University (Sep 2022 – May 2024)

  • Automated grading systems using Python and GitHub Actions; mentored students on ML fundamentals and model development.

🚀 Featured Projects

Reddit Data Pipeline 🔗

Automated a sentiment analysis pipeline using Apache Airflow and AWS (S3, Glue, Redshift, Athena).

MLOps for Emotion Detection 🔗

Built an end-to-end MLOps pipeline on GCP with Airflow, MLflow, and Docker, reducing deployment time by 40%.

Product Review Sentiment Analysis 🔗

Compared VADER, RoBERTa, and LSTM models for review sentiment classification; RoBERTa achieved 85% accuracy.