This project is a real-time streaming, storage, and visualization system for sensor data. It leverages Kafka for data streaming, InfluxDB (a time-series database) and MongoDB (a NoSQL database) for storage, and Grafana for visualization. Sensor data is captured using the SensorServer app and streamed to Kafka; consumer APIs for each database handle data ingestion and storage, and the whole system is containerized and orchestrated with Docker Compose.
This diagram provides an overview of the system's components and data flow.
- Data is collected from smartphone sensors (Accelerometer, Gyroscope, Magnetic field, Gravity) using the SensorServer app.
- Sensor data is transmitted via WebSocket from the phone to a Kafka producer (a minimal producer sketch follows this overview).
- Kafka acts as the message broker, streaming data to designated topics for InfluxDB and MongoDB.
- The system supports two databases:
  - InfluxDB: Specialized for time-series data storage and includes native visualization capabilities.
  - MongoDB: Stores unstructured sensor data flexibly.
- Flask-based APIs consume data from Kafka and store it in the respective databases:
  - InfluxDB Consumer API: Handles data sent to InfluxDB.
  - MongoDB Consumer API: Handles data sent to MongoDB.
- Flask-SocketIO is used to send log messages to the frontend, such as:
  - Starting/stopping the pipeline.
  - Logs of data written to the database.
- Both APIs provide a browser-based interface for pipeline management:
  - Start/stop data pipelines.
  - Configure database credentials and settings.
- InfluxDB's built-in visualization tools offer quick insights into time-series data.
- Grafana supports advanced dashboards, integrating both MongoDB and InfluxDB for custom visualization.
- The system is containerized, with all services defined in a `docker-compose.yml` file.
- APIs are pushed to Docker Hub, allowing deployment across various environments.
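For illustration, here is a minimal sketch of the producer side of this flow: it reads readings from the SensorServer WebSocket and publishes them to a Kafka topic. The WebSocket address, payload shape, and serialization are assumptions for the example and may differ from the project's actual `producer.py` scripts.

```python
# Illustrative producer sketch (not the project's producer.py): forwards
# SensorServer readings to a Kafka topic. The WebSocket address and payload
# shape are assumptions for this example.
import json

import websocket                      # pip install websocket-client
from kafka import KafkaProducer       # pip install kafka-python

KAFKA_TOPIC = "sensor-data-influxdb"  # or "sensor-data-mongodb"
SENSOR_WS_URL = "ws://<phone-ip>:8080/sensor/connect?type=android.sensor.accelerometer"  # placeholder address

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def on_message(ws, message):
    # Assume SensorServer sends one JSON object per reading; forward it unchanged.
    producer.send(KAFKA_TOPIC, json.loads(message))

ws = websocket.WebSocketApp(SENSOR_WS_URL, on_message=on_message)
ws.run_forever()
```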
```
APIs_Requests/
    influxdb_consumer_api.py              # Script to interact with the InfluxDB consumer API via the `requests` library
    mongodb_consumer_api.py               # Script to interact with the MongoDB consumer API via the `requests` library
Query_DBs/
    query_influxdb.py                     # Script for querying InfluxDB
    query_mongodb.py                      # Script for querying MongoDB
assets/
    GIF.gif                               # Real-time visualization of sensor data
    UML_Diagram.png                       # System architecture diagram
    Grafana_Visualization.png             # Grafana dashboard visualization
    InfluxDB_Visualization.png            # InfluxDB dashboard visualization
    screenshot_influxdb_consumer_app.png  # Screenshot of the InfluxDB Consumer App
    screenshot_mongodb_consumer_app.png   # Screenshot of the MongoDB Consumer App
influxdb_consumer_api/
    templates/                            # HTML templates for the InfluxDB Consumer API interface
    Dockerfile                            # Dockerfile for the API
    app.py                                # Flask API for the InfluxDB consumer
    producer.py                           # Kafka producer for InfluxDB
    requirements.txt                      # Dependencies for the InfluxDB Consumer API
mongodb_consumer_api/
    templates/                            # HTML templates for the MongoDB Consumer API interface
    Dockerfile                            # Dockerfile for the API
    app.py                                # Flask API for the MongoDB consumer
    producer.py                           # Kafka producer for MongoDB
    requirements.txt                      # Dependencies for the MongoDB Consumer API
RealTime_visual_sensors_data.py           # Script for visualizing sensor data in real time
docker-compose.yml                        # Docker Compose file for orchestration
requirements.txt                          # Project dependencies
```
- SensorServer App:
  - Install and configure the SensorServer app on your smartphone (Android only) to stream sensor data via WebSocket.
- Docker:
  - Ensure that Docker is installed and running on your system; install it from the official Docker site if needed.
- Python:
  - This project uses Python 3.12.6; install it from the official Python site if needed.
Clone the repository to your local machine:
git clone /~https://github.com/TouradBaba/sensors_data_streaming.git
cd sensors_data_streaming
It's recommended to create a virtual environment to manage the Python dependencies for the project:

```
python -m venv myenv
```

Activate the virtual environment:

- On Windows:

  ```
  myenv\Scripts\activate
  ```

- On macOS/Linux:

  ```
  source myenv/bin/activate
  ```

Install the required Python dependencies:

```
pip install -r requirements.txt
```
Make sure Docker is installed and running on your system.
Run the following command to launch the system:
```
docker-compose up --build
```
- Access the InfluxDB dashboard at http://localhost:8086 after starting Docker Compose.
- Default credentials:
  - Username: `admin`
  - Password: `admin12345`
- Create an API Token:
  - Navigate to InfluxDB CLI → Initialize Client → Copy the Token.
  - Use this token in the InfluxDB Consumer API configuration.
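To confirm the token works outside the consumer API, a small check like the following can be used. It assumes the `influxdb-client` Python package; the org and bucket names are placeholders for whatever you created in the InfluxDB UI.

```python
# Sketch: verify an InfluxDB API token by writing a single test point.
# The org and bucket names are placeholders, not values from this project.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(
    url="http://localhost:8086",
    token="YOUR_API_TOKEN",
    org="your-org",
)
write_api = client.write_api(write_options=SYNCHRONOUS)

# Write one test point; an exception here usually means the token or org is wrong.
write_api.write(bucket="your-bucket", record=Point("sensor_test").field("value", 1.0))
print("Token accepted, test point written.")
```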
- MongoDB uses the default database (`sensor_data`) and collection (`sensor_data`).
- No additional configuration is required.
- APIs are accessible via localhost as configured in the `docker-compose.yml` file:
  - InfluxDB Consumer API: access the app at http://localhost:5000.
  - MongoDB Consumer API: access the app at http://localhost:5001.
- You can also interact with the APIs using the scripts located in the `APIs_Requests/` directory.
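In the same spirit as those scripts, below is a hypothetical example of driving a consumer API with the `requests` library. The `/start` and `/stop` endpoint paths are assumptions; check the scripts in `APIs_Requests/` for the actual routes and payloads.

```python
# Hypothetical example of starting and stopping a pipeline through the
# InfluxDB Consumer API. Endpoint paths are assumptions for illustration.
import requests

INFLUXDB_API = "http://localhost:5000"

# Start the InfluxDB pipeline.
resp = requests.post(f"{INFLUXDB_API}/start")
print(resp.status_code, resp.text)

# ... let data stream for a while, then stop the pipeline.
resp = requests.post(f"{INFLUXDB_API}/stop")
print(resp.status_code, resp.text)
```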
- Access Grafana at http://localhost:3000.
- Add data sources:
  - InfluxDB: Configure it with the credentials and token created earlier.
  - MongoDB: Use the MongoDB Data Source Plugin.
- Create dashboards with custom visualizations.
- Starting the Pipeline:
  - When a pipeline is started in a given consumer API (InfluxDB or MongoDB), the API first calls the Kafka Producer.
- Data Collection:
  - The Kafka Producer retrieves real-time sensor data from the SensorServer app, which streams data via WebSocket.
- Data Streaming:
  - The data from the phone's sensors is sent to the corresponding Kafka topic (`sensor-data-influxdb` for InfluxDB or `sensor-data-mongodb` for MongoDB).
- Data Storage:
  - The API then consumes data from the Kafka topic. The data is processed and then stored in the respective database (InfluxDB or MongoDB).
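As a rough illustration of the consume-and-store step for the InfluxDB pipeline, the sketch below uses `kafka-python` and `influxdb-client`. The payload fields, org, bucket, and measurement names are placeholders, and the project's actual `app.py` may process the data differently.

```python
# Sketch of the consume-and-store step for the InfluxDB pipeline
# (placeholder field and bucket names; not the project's app.py).
import json

from kafka import KafkaConsumer
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

consumer = KafkaConsumer(
    "sensor-data-influxdb",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
client = InfluxDBClient(url="http://localhost:8086", token="YOUR_API_TOKEN", org="your-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

for message in consumer:
    reading = message.value
    # Assume a SensorServer-style payload like {"values": [x, y, z], ...}.
    x, y, z = reading["values"]
    point = Point("accelerometer").field("x", x).field("y", y).field("z", z)
    write_api.write(bucket="your-bucket", record=point)
```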
The `Query_DBs` and `APIs_Requests` folders contain scripts that enable interaction with the databases and consumer APIs:

- Query_DBs:
  - The scripts `query_influxdb.py` and `query_mongodb.py` allow you to query the data stored in InfluxDB and MongoDB, respectively.
- APIs_Requests:
  - The scripts `influxdb_consumer_api.py` and `mongodb_consumer_api.py` enable you to interact with the Flask-based APIs to start/stop the pipeline.
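For a quick sense of what such queries can look like, here is an illustrative example (not the project's scripts); the InfluxDB token, org, bucket, and the MongoDB connection string are placeholders.

```python
# Illustrative queries against both stores. Token, org, bucket, and the Mongo
# connection string are placeholders; see Query_DBs/ for the real scripts.
from influxdb_client import InfluxDBClient
from pymongo import MongoClient

# InfluxDB: read the last 10 minutes of data from a placeholder bucket.
influx = InfluxDBClient(url="http://localhost:8086", token="YOUR_API_TOKEN", org="your-org")
tables = influx.query_api().query('from(bucket: "your-bucket") |> range(start: -10m)')
for table in tables:
    for record in table.records:
        print(record.get_time(), record.get_field(), record.get_value())

# MongoDB: five most recent documents from the sensor_data collection.
mongo = MongoClient("mongodb://localhost:27017")
for doc in mongo["sensor_data"]["sensor_data"].find().sort("_id", -1).limit(5):
    print(doc)
```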
The `RealTime_visual_sensors_data.py` script visualizes sensor data in real time, allowing you to directly monitor the data streamed from the SensorServer app.
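A minimal sketch of the same idea is shown below; it assumes the SensorServer accelerometer WebSocket endpoint and an `[x, y, z]` payload and plots with matplotlib, so the actual script may differ.

```python
# Sketch: live plot of accelerometer readings straight from SensorServer.
# The WebSocket address and payload shape are assumptions for illustration.
import json
from collections import deque

import matplotlib.pyplot as plt
import websocket

history = deque(maxlen=200)  # keep the last 200 readings

plt.ion()
fig, ax = plt.subplots()

def on_message(ws, message):
    reading = json.loads(message)
    history.append(reading["values"])  # assumed to be [x, y, z]
    ax.clear()
    ax.plot(list(history))             # one line per axis
    ax.set_title("Accelerometer (live)")
    plt.pause(0.01)

ws = websocket.WebSocketApp(
    "ws://<phone-ip>:8080/sensor/connect?type=android.sensor.accelerometer",
    on_message=on_message,
)
ws.run_forever()
```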
I would like to acknowledge Umer Farooq for developing the SensorServer app, which was used to collect smartphone sensor data.