Exo-Explorer is an AI-powered web app that cleans NASA's extensive exoplanet data, engineers features from it, and analyzes it to predict habitability scores. Built with GPT-4o, LangGraph, and H2O AutoML, it automates scientific discovery by combining rule-based filtering with probabilistic modeling to surface potentially life-supporting worlds across the galaxy.
Local AI Architecture
- Streamlit: Frontend interface for interactive exploration
- LangGraph + GPT-4o: Orchestrates a modular data-cleaning agent with memory and human-in-the-loop review
- Pandas + SQLite: Stores planetary data and model outputs locally
- H2O AutoML: Trains models to predict exoplanet habitability with automated feature selection and tuning
- OpenAI + LangChain: Powers agent-based code generation and reasoning
- Local CSV & SQL Storage: Persists cleaned datasets and predictions for download or analysis
Backend
Python
SQLAlchemy
Frontend
Streamlit
AI/ML
GPT-4o
AI Data Science Team
LangChain
LangGraph
H2O AutoML
CI / CD
GitHub Actions
Let's Connect
If you're a builder, dreamer, or data explorer, reach out. Whether it's a conversation, a collaboration, or just curiosity, I'd love to connect. You can find me on LinkedIn or email me directly.
Step 1: Open the App
Users first view the dataset in a Streamlit dashboard. The app connects to the NASA Exoplanet Archive and is ready for H2O AutoML predictive analysis.
Step 2: Click on Train Model
The user then clicks on Start Training, which triggers the H2O AutoML pipeline. Using LangGraph and GPT-4o, this pipeline automatically cleans the data, engineers features, and trains a model on the dataset.
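The cleaning and feature-engineering stage can be pictured with a small hand-written stand-in. This is only a sketch: the project delegates this work to a LangGraph + GPT-4o agent, whereas the code below uses the standard library alone. The column names `pl_name`, `pl_rade`, and `pl_eqt` follow NASA Exoplanet Archive naming, but which columns the app actually uses is an assumption, and the `temp_offset_k` feature (distance from Earth's roughly 255 K equilibrium temperature) is purely illustrative.

```python
import csv
import io

# Made-up sample rows in the style of NASA Exoplanet Archive columns.
SAMPLE_CSV = """pl_name,pl_rade,pl_eqt
Planet A,1.1,265
Planet B,,300
Planet C,11.2,1500
Planet D,0.9,not_available
"""

# Columns that must be present and numeric (an assumption for this sketch).
REQUIRED = ("pl_rade", "pl_eqt")


def clean_rows(csv_text):
    """Drop rows whose required fields are missing or non-numeric."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            values = {col: float(row[col]) for col in REQUIRED}
        except (TypeError, ValueError):
            continue  # skip rows that cannot be parsed
        cleaned.append({"pl_name": row["pl_name"], **values})
    return cleaned


def add_features(rows):
    """Add an illustrative feature: offset from ~255 K, in kelvin."""
    for row in rows:
        row["temp_offset_k"] = abs(row["pl_eqt"] - 255.0)
    return rows


rows = add_features(clean_rows(SAMPLE_CSV))
for row in rows:
    print(row)
```

Rows with an empty or unparseable value are skipped rather than imputed, which keeps the sketch short. A real pipeline might choose imputation instead.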
Step 3: AutoML Leaderboard Generated
The pipeline runs a series of data cleaning and feature engineering steps, generating a leaderboard of models. The best-performing model is selected based on metrics like accuracy and F1 score.
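Selecting a winner from a leaderboard can be sketched in a few lines. H2O AutoML produces its own ranked leaderboard, so the code below is not the project's implementation. It only illustrates ranking by F1 score with accuracy as a tie-breaker, which is an assumed rule. The `LeaderboardEntry` type, the model IDs, and the metric values are all hypothetical and are not results from this project.

```python
from dataclasses import dataclass


@dataclass
class LeaderboardEntry:
    """One hypothetical leaderboard row."""
    model_id: str
    accuracy: float
    f1: float


def best_model(entries):
    """Return the entry with the highest F1, breaking ties by accuracy."""
    if not entries:
        raise ValueError("leaderboard is empty")
    return max(entries, key=lambda e: (e.f1, e.accuracy))


# Made-up values for illustration only.
leaderboard = [
    LeaderboardEntry("gbm_1", accuracy=0.91, f1=0.78),
    LeaderboardEntry("glm_1", accuracy=0.88, f1=0.74),
    LeaderboardEntry("stacked_1", accuracy=0.90, f1=0.78),
]

winner = best_model(leaderboard)
print(winner.model_id)
```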
Step 4: Probability-Habitability Column Now Filled
The best model's predictions are saved back to the SQLite database, enriching each planet record with a habitability score. This allows users to filter and explore potentially habitable worlds.
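The write-back and filtering step might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The table name `planets`, the columns `pl_name` and `habitability_score`, and the helpers `save_scores` and `top_candidates` are assumptions made for illustration. The project's actual schema is not shown here, and the scores are made up.

```python
import sqlite3


def save_scores(conn, scores):
    """Write a habitability score onto each existing planet row."""
    conn.executemany(
        "UPDATE planets SET habitability_score = ? WHERE pl_name = ?",
        [(score, name) for name, score in scores.items()],
    )
    conn.commit()


def top_candidates(conn, min_score):
    """Return (name, score) pairs at or above min_score, best first."""
    cursor = conn.execute(
        "SELECT pl_name, habitability_score FROM planets "
        "WHERE habitability_score >= ? "
        "ORDER BY habitability_score DESC",
        (min_score,),
    )
    return cursor.fetchall()


# In-memory database with a hypothetical schema and made-up scores.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE planets (pl_name TEXT PRIMARY KEY, habitability_score REAL)"
)
conn.executemany(
    "INSERT INTO planets (pl_name) VALUES (?)",
    [("Planet A",), ("Planet B",), ("Planet C",)],
)

save_scores(conn, {"Planet A": 0.82, "Planet B": 0.15, "Planet C": 0.64})
print(top_candidates(conn, 0.5))
```

Using parameterized queries (`?` placeholders) keeps planet names with quotes or other special characters from breaking the SQL.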
Step 5: Explore The Exoplanet
Explore the actual worlds with the highest habitability scores. Visit the NASA Exoplanet Archive to view the full details of each planet, including its characteristics and potential for supporting life.
ExoExplorer/
│
├── Exo_Explorer.py                 # Streamlit app for exoplanet habitability
│
├── cleaning_agent/                 # LangGraph-powered data cleaning agent
│   ├── agent.py                    # Main agent logic and pipeline
│   └── utils.py                    # Helper functions
│
├── dataset/                        # NASA exoplanet CSV dataset
├── database/                       # SQLite database for storing results
├── models/                         # Saved H2O AutoML models
├── img/                            # Images or diagrams (optional)
│
├── templates/                      # Prompt templates for agent instructions
├── parsers/                        # Output parsers for GPT responses
├── tools/                          # Shared tools/utilities
├── utils/                          # General-purpose utility scripts
│
├── README.md                       # Project overview and instructions
├── requirements.txt                # Core Python dependencies
├── additional-requirements.txt     # Extra deps for experiments or modules
└── .gitignore                      # Ignore Python caches, models, etc.
git clone https://github.com/LannonTheCannon/gen_ai_bootcamp.git
cd Week3
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
streamlit run Exo_Explorer.py