This project is designed to showcase a WebSocket server capable of handling data streaming from over 50,000 clients simultaneously. It simulates a scenario where clients continuously send meter reading data to the server.
Tip
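In theory, a single client IP address can hold up to about 65,535 connections to one server IP address and port, because each connection needs a distinct source port. On the server side, the practical limit is set by file descriptors and memory rather than by port numbers.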
- Scalable WebSocket server for handling multiple client connections using:
  - multiple CPU cores
  - Horizontal Pod Autoscaler (HPA) in Kubernetes
  - pub/sub to absorb unpredictable load
- Client scripts to simulate data streaming from numerous sources; mock clients can be load tested using artillery.io.
- Reduced load on the machine through batch processing of data, with batches flushed either by buffer size or by sampling frequency.
- Handling of the race condition in batch processing.
- Data parsing and handling from Excel files.
- Development environment setup for real-time TypeScript compilation (for the bonus point 😉).
- For the API endpoint, data is indexed while it is stored, balancing write cost against faster reads.
- OLAP-style storage is used to store and fetch data from the DB.
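The batching and race-condition items above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `BatchBuffer` class, the `MeterReading` shape, and the `flushFn` callback are hypothetical names, and the real DB write is replaced by a caller-supplied function. The buffer flushes either when it reaches `maxSize` or on a timer, and it swaps the buffer out synchronously before awaiting the write, so a size-triggered flush and a timer-triggered flush cannot write the same readings twice.

```typescript
// Hypothetical reading shape; the project's real schema may differ.
interface MeterReading {
  meterId: string;
  value: number;
  timestamp: number;
}

// Caller-supplied write function (for example, a bulk insert into the DB).
type FlushFn<T> = (batch: T[]) => Promise<void>;

class BatchBuffer<T> {
  private buffer: T[] = [];
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly flushFn: FlushFn<T>;
  private readonly maxSize: number;
  private readonly intervalMs: number;

  constructor(flushFn: FlushFn<T>, maxSize: number, intervalMs: number) {
    this.flushFn = flushFn;
    this.maxSize = maxSize;
    this.intervalMs = intervalMs;
  }

  // Time-based trigger: flush whatever has accumulated every intervalMs.
  start(): void {
    if (this.timer === null) {
      this.timer = setInterval(() => void this.flush(), this.intervalMs);
    }
  }

  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  // Size-based trigger: flush as soon as the buffer is full.
  add(item: T): void {
    this.buffer.push(item);
    if (this.buffer.length >= this.maxSize) {
      void this.flush();
    }
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    // Swap the buffer before the first await. Any flush that starts while
    // this write is pending sees only items added after the swap.
    const batch = this.buffer;
    this.buffer = [];
    try {
      await this.flushFn(batch);
    } catch {
      // Assumption: on failure, re-queue the batch ahead of newer items
      // so the next flush retries it.
      this.buffer = batch.concat(this.buffer);
    }
  }
}

// Usage: flush every 100 readings or every 500 ms, whichever comes first.
const readings = new BatchBuffer<MeterReading>(
  async (batch) => {
    console.log(`writing ${batch.length} readings`);
  },
  100,
  500,
);
```

Call `readings.start()` to enable the timer-based flush and `readings.stop()` on shutdown. Re-queuing a failed batch is one possible policy; dropping it or sending it to a dead-letter store are equally valid choices depending on how much data loss is acceptable.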
Improvements (not included in the current submission, as client connections and load shedding were handled by Artillery)
- Load shedding of rejected or idle WebSocket connections on both the server and client ends.
- Restoring connections by storing the connection config in ZooKeeper.
- Increasing the waiting time between retries, up to a maximum backoff time.
- Making the backoff time random, so that not all clients reconnect simultaneously.
- Managing stream resumption after clients reconnect by maintaining a cache or persistent storage.
- Managing the frequency of heartbeats to keep track of idle connections, so that clients that have been idle for a long time create a new connection when needed rather than keeping the connection alive.
- Use of master and slave databases, so that data can be read faster from an analytical database that offers read-only operations.
- Use of multiple slave databases to balance load on the database: data is bulk written into a slave database and eventually written to the master database.
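The two backoff items above (a growing wait capped at a maximum, plus randomization) can be sketched as follows. This is a hedged illustration rather than code from the project: `backoffDelay`, `connectWithRetry`, `BackoffOptions`, and the `connect` callback are hypothetical names, and the "full jitter" variant shown here is only one of several common strategies.

```typescript
interface BackoffOptions {
  baseMs: number; // delay ceiling for the first retry
  maxMs: number; // maximum backoff time
}

// "Full jitter": the ceiling doubles with each attempt up to maxMs, and the
// actual delay is drawn uniformly from [0, ceiling).
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls the hypothetical connect() until it succeeds or maxAttempts is
// reached. Resolves with the number of failed attempts before success.
async function connectWithRetry(
  connect: () => Promise<void>,
  opts: BackoffOptions,
  maxAttempts: number,
): Promise<number> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await connect();
      return attempt;
    } catch {
      // Wait before the next try, but not after the final failure.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, opts));
      }
    }
  }
  throw new Error(`connection failed after ${maxAttempts} attempts`);
}
```

Because each client draws its own random delay, a server restart does not cause every client to reconnect at the same instant. The socket.io client also exposes built-in reconnection settings (`reconnectionDelay`, `reconnectionDelayMax`, `randomizationFactor`), which may be sufficient without a custom loop.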
- Node.js
- TypeScript
- Kubernetes
- minikube
- Ingress
- Kafka
- ZooKeeper
- WebSocket (using the socket.io library)
- Excel file handling (using the xlsx library)
Note
For deploying on Kubernetes using minikube, an article is coming soon.
- server/: Contains the WebSocket server implementation and the Express server.
- client/: Contains the WebSocket client implementation and the script for sending data from Excel.
- kubernetes/: Contains services, deployments, HPA, and PVC for the Kafka service, the WebSocket service (created in this project), and ZooKeeper.
- Files/: Contains sample data files.
- controllers/: Contains the API controllers.
- helpers/: Contains batch processing for saving data to the DB and helpers for the API controllers.
- logic/: Contains the query builder.
- routes/: Contains the API routes.
- store/: Contains the DB connection, Kafka producer and consumer, enums, interfaces, schema, models, and data validators.
Make sure you have Node.js (version 16 or later) installed on your system.
- Clone the Repository
git clone https://github.com/SatyamAnand98/pulseEnergy.git
cd pulseEnergy
- Install Dependencies
npm install
- Build the Project
npm run build
- Start the Server
npm run start-server
- Run the client script to start a manual WebSocket client (in a separate terminal)
npm run start-client
- Send Excel Data to Server (in a separate terminal)
npm run start-sendData
- Start express server
npm run start
- Load Testing using Artillery.io
npm run loadTesting
To run the server, client, or sendData script in development mode with real-time TypeScript compilation:
- Start the Server
npm run dev-server
- Run the client script to start a manual WebSocket client (in a separate terminal)
npm run dev-client
- Send Excel Data to Server (in a separate terminal)
npm run dev-sendData
- Start express server
npm run dev
- Load Testing using Artillery.io
npm run loadTesting
Important
To view the API documentation for the required endpoints, click here