Merge pull request #305 from WeDataSphere/dev-1.0.0
LGTM.
FinalTarget authored Jun 21, 2022
2 parents 41a4ac8 + 997a823 commit 9e04ddd
Showing 214 changed files with 7,734 additions and 3,053 deletions.
63 changes: 31 additions & 32 deletions README-ZH.md
@@ -1,68 +1,67 @@
# Exchangis

[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)

[English](README.md) | 中文

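## Project Overview
## Introduction

Exchangis is a data exchange tool developed in-house by WeDataSphere, WeBank's big data platform. It supports synchronizing structured and unstructured data between heterogeneous data sources.
Exchangis 1.0.0 is a new version of the data exchange tool, developed jointly by WeDataSphere, WeBank's big data platform, and community users. It supports synchronizing structured and unstructured data between heterogeneous data sources.

Exchangis abstracts a unified set of data source and synchronization job definition plugins, so users can quickly onboard new data sources and use them in the web pages after only simple configuration in the database.

Built on a plugin-based framework design and the computing middleware [Linkis](/~https://github.com/apache/incubator-linkis), Exchangis can quickly integrate with the data synchronization engines already integrated into Linkis, converting Exchangis synchronization jobs into data synchronization jobs for those Linkis engines.

Thanks to the connection, reuse, and simplification capabilities of the [Linkis](/~https://github.com/apache/incubator-linkis) computing middleware, Exchangis natively has financial-grade data synchronization capabilities: high concurrency, high availability, multi-tenant isolation, and resource control.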

### Interface Preview

![image](https://user-images.githubusercontent.com/27387830/171488936-2cea3ee9-4ef7-4309-93e1-e3b697bd3be1.png)

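## Core Features

### 1. Data source management
### 1. Lightweight data source management

- Based on Linkis DataSource, Exchangis abstracts all the capabilities an underlying data source needs in order to act as the Source and Sink of a synchronization job. Creating a data source takes only simple configuration.

- A dedicated data source version publishing feature supports rolling back to historical data source versions; with one-click publishing, historical data sources do not need to be configured again.

Based on Linkis DataSource, Exchangis abstracts all the capabilities an underlying data source needs in order to act as the Source and Sink of a synchronization job.

- **Multi-transport engine support**
Transport engines can be scaled horizontally;
The current version fully aggregates the offline batch engine DataX and partially aggregates the big data batch import/export engine SQOOP
### 2. Highly stable, fast-responding data synchronization task execution

- **Near-real-time task control**
Quickly captures transfer task logs, transfer rates, and other information; tasks can be closed in real time;
Tasks can be rate-limited dynamically according to bandwidth conditions
Quickly captures transfer task logs, transfer rates, and other information; monitors and displays metrics across multiple tasks, including CPU usage, memory usage, and data synchronization records; supports closing tasks in real time;

- **Unstructured transfer support**
The DataX framework has been reworked with a separately built binary-stream fast channel, suited to pure data synchronization scenarios with no data transformation.
- **High-concurrency task transfer**
Multiple tasks execute concurrently, sub-tasks can be copied, and the status of each task is shown in real time; multi-tenant execution effectively keeps tasks from affecting one another while running;

- **Task status self-check**
Monitors long-running tasks and tasks in an abnormal state, releases occupied resources promptly, and raises alerts.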

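## Comparison with existing systems
A comparison with some existing data exchange tools and platforms:

| Function module | Description | Exchangis | DataX | Sqoop | DataLink | DBus |
| :----: | :----: |-------|-------|-------|-------|-------|
| UI | Integrated, convenient management interface and monitoring window | Integrated | | | Integrated | Integrated |
| Installation and deployment | Ease of deployment and third-party dependencies | One-click deployment, no dependencies | No dependencies | Depends on a Hadoop environment | Depends on Zookeeper | Depends on many third-party components |
| Data permission management | Multi-tenant permission configuration and data source permission control | Supported | Not supported | Not supported | Not supported | Supported |
| | Dynamic rate-limited transfer | Supported | Partially supported; cannot be adjusted dynamically | Partially supported; cannot be adjusted dynamically | Supported | Supported, via Kafka |
| Data transfer | Binary transfer of unstructured data | Supported, fast channel | Not supported | Not supported | Not supported; everything is a record | Not supported; must be converted to a unified message format |
| | Embedded processing code | Supported, dynamic compilation | Not supported | Not supported | Not supported | Partially supported |
| | Transfer resume from breakpoint | Supported (not open source) | Not supported; retry only | Not supported; retry only | Supported | Supported |
| High service availability | Multiple service nodes; a failure does not affect use | Application is highly available, transfer is a single point (distributed architecture planned) | Single-point service (open-source version) | Multi-point transfer | Application and transfer highly available | Application and transfer highly available |
| System management | Node and resource management | Supported | Not supported | Not supported | Supported | Supported |
Monitors long-running tasks and tasks in an abnormal state, aborts them, and releases occupied resources promptly.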


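### 3. Integration with DSS workflows: a gateway to one-stop big data development

- Implements the three-level DSS AppConn specification: the level-1 SSO specification, the level-2 organizational structure specification, and the level-3 development process specification;

- As the data exchange node in a DSS workflow, it is the gateway step of the whole workflow chain and gives subsequent workflow nodes a solid data foundation to run on;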

## Overall Design

### Architecture Design

![Architecture design](../../../images/zh_CN/ch1/architecture.png)
![Architecture design](https://user-images.githubusercontent.com/27387830/173026793-f1475803-9f85-4478-b566-1ad1d002cd8a.png)


## Related Documents
[Installation and deployment guide](exchangis_deploy_cn.md)
[User manual](exchangis_user_manual_cn.md)
[Installation and deployment guide](/~https://github.com/WeDataSphere/Exchangis/blob/dev-1.0.0-rc/docs/zh_CN/ch1/exchangis_deploy_cn.md)
[User manual](/~https://github.com/WeDataSphere/Exchangis/blob/dev-1.0.0-rc/docs/zh_CN/ch1/exchangis_user_manual_cn.md)

## Communication and Contribution

If you want the fastest response, please open an issue, or scan the QR code to join the group:

![communication](../../../images/communication.png)
![communication](images/zh_CN/ch1/communication.png)

## License

Exchangis is under the Apache 2.0 License. See the [License](../../../LICENSE) file for details.
Exchangis is under the Apache 2.0 License. See the [License](../../../LICENSE) file for details.
81 changes: 43 additions & 38 deletions README.md
@@ -3,60 +3,65 @@
English | [中文](README-ZH.md)

## Introduction
Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources. On the application layer, it has business features such as data permission management and control, high availability of node services and multi-tenant resource isolation. On the data layer, it also has architectural characteristics such as diversified transmission architecture, module plug-in and low coupling of components.

Exchangis's transmission and exchange capabilities depend on its underlying aggregated transmission engines. It defines a unified parameter model for various data sources on the top layer. It maps and configures the parameter model for each transmission engine, and then converts it into the engine's input model. Each type of engine will add Exchangis features, and the enhancement of certain engine features will improve the Exchangis features. Exchangis's default engine aggregated and enhanced is Alibaba's DataX transmission engine.
Exchangis 1.0.0 is a new version of the data exchange tool, jointly developed by WeDataSphere (WeBank's big data platform) and community users. It supports synchronizing structured and unstructured data between heterogeneous data sources.

## Features
- **Data Source Management**
Share your own data source in a bound project;
Set the external authority of the data source to control the inflow and outflow of data.
Exchangis abstracts a unified set of data source and synchronization job definition plugins, allowing users to quickly access new data sources and use them on pages with simple configuration in the database.

- **Multi-transport Engine Support**
Transmission engine scales horizontally;
The current version fully aggregates the offline batch engine DataX and partially aggregates the big data batch derivative engine SQOOP
Based on the plugin framework design and the computing middleware [Linkis](/~https://github.com/apache/incubator-Linkis), Exchangis can quickly connect to the data synchronization engine in Linkis, and convert the data synchronization job of Exchangis into the job of Linkis.

- **Near Real-time Task Control**
Quickly capture the transmission task log, transmission rate and other information, close the task in real time;
Dynamically limit transmission rate based on bandwidth
With the help of [Linkis](/~https://github.com/apache/incubator-linkis) computing middleware's connection, reuse and simplification capabilities, Exchangis is inherently equipped with financial-grade data synchronization capabilities of high concurrency, high availability, multi-tenant isolation and resource control.

- **Support Unstructured Transmission**
Transform the DataX framework and build a binary stream fast channel separately, suitable for pure data synchronization scenarios without data conversion.
### Interface preview

- **Task Status Self-check**
Monitor long-running tasks and tasks with abnormal status, release occupied resources in time and issue alarms.
![image](https://user-images.githubusercontent.com/27387830/171488936-2cea3ee9-4ef7-4309-93e1-e3b697bd3be1.png)

## Comparison With Existing Systems
Comparison of some existing data exchange tools and platforms:
## Core characteristics

| Function module | Description | Exchangis | DataX | Sqoop | DataLink | DBus |
| :----: | :----: |-------|-------|-------|-------|-------|
| UI | Integrated the convenient management interface and monitoring window | Integrated | None | None | Integrated |Integrated |
| Installation and deployment | Ease of deployment and third-party dependencies | One-click deployment, no dependencies | No dependencies | Rely on Hadoop environment | Rely on Zookeeper | Rely on a large number of third-party components |
| Data authority management | Multi-tenant permission configuration and data source permission control | Supported | Not supported | Not supported | Not supported | Supported |
| | Dynamic limit transmission | Supported | Partially supported, unable to adjust dynamically | Partially supported, unable to adjust dynamically | Supported | Supported, with Kafka |
| Data transmission | Unstructured data binary transmission | Supported, fast channel | Not supported | Not supported | Not supported, only transports records | Not supported, needs to be converted to a unified message format |
| | Embed processing code | Supported, dynamic compilation | Not supported | Not supported | Not supported | Partially supported |
| | Transmission breakpoint recovery | Supported (not open source) | Not supported | Not supported | Supported | Supported |
| High availability | Multiple services, failure does not affect use | Application high availability, transmission single point (distributed architecture planned) | Single point service (open source version) | Multipoint transmission | Application and transmission high availability | Application and transmission high availability |
| System Management | Node and resource management | Supported | Not supported | Not supported | Supported | Supported |
### 1. Lightweight datasource management

## Overall Design
- Based on Linkis DataSource, Exchangis abstracts all the necessary capabilities of the underlying data source as the Source and Sink of a synchronization job. A data source can be created with simple configuration.

### Architecture
- A dedicated datasource version publishing feature supports rolling back to historical datasource versions; with one-click publishing, historical datasources do not need to be configured again.


### 2. High-stability and fast-response data synchronization task execution

- **Near-real-time task management**
Quickly captures transmission task logs, transmission rates, and other information; monitors and displays metrics across multiple tasks, including CPU usage, memory usage, and data synchronization records; and supports closing tasks in real time.

- **High-concurrency task transmission**
Multiple tasks run concurrently, sub-tasks can be copied, and the status of each task is shown in real time. Multi-tenant execution keeps tasks from affecting one another while they run.

- **Self-check of task status**
Monitors long-running tasks and abnormal tasks, stops them, and releases occupied resources promptly.


### 3. Integrate with DSS workflow, one-stop big data development portal

- Realize DSS AppConn's three-level specification, including the first-level SSO specification, the second-level organizational structure specification and the third-level development process specification.

- As the data exchange node of DSS workflow, it is the fundamental process in the whole workflow link, which provides a solid data foundation for the subsequent operation of workflow nodes.

## Overall Design

### Architecture Design

![架构设计](images/en_US/ch1/architecture.png)

![Architecture](images/en_US/ch1/architecture.png)

## Documents
[Quick Deploy](docs/zh_CN/ch1/exchangis_deploy_cn.md)
[User Manual](docs/zh_CN/ch1/exchangis_user_manual_cn.md)

## Communication
[Quick Deploy](/~https://github.com/WeDataSphere/Exchangis/blob/dev-1.0.0-rc/docs/zh_CN/ch1/exchangis_deploy_cn.md)
[User Manual](/~https://github.com/WeDataSphere/Exchangis/blob/dev-1.0.0-rc/docs/zh_CN/ch1/exchangis_user_manual_cn.md)

## Communication and contribution

If you desire immediate response, please kindly raise issues to us or scan the below QR code by WeChat and QQ to join our group:
If you want the fastest response, please open an issue, or scan the QR code to join the group:

![Communication](images/communication.png)
![communication](images/en_US/ch1/communication.png)

## License

Exchangis is under the Apache 2.0 License. See the [License](LICENSE) file for details.
Exchangis is under the Apache 2.0 License. See the [License](../../../LICENSE) file for details.

4 changes: 2 additions & 2 deletions assembly-package/config/application-exchangis.yml
@@ -1,12 +1,12 @@
server:
  port: 9500
  port: 9321
spring:
  application:
    name: exchangis-server
eureka:
  client:
    serviceUrl:
      defaultZone: http://127.0.0.1:20503/eureka/
      defaultZone: http://127.0.0.1:3306/eureka/
  instance:
    metadata-map:
      test: wedatasphere
4 changes: 4 additions & 0 deletions assembly-package/config/config.sh
@@ -0,0 +1,4 @@
LINKIS_GATEWAY_HOST=
LINKIS_GATEWAY_PORT=
EXCHANGIS_PORT=
EUREKA_URL=
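
`EXCHANGIS_PORT` and `EUREKA_URL` appear to correspond to `server.port` and `eureka.client.serviceUrl.defaultZone` in `application-exchangis.yml`. This commit does not show how `config.sh` is consumed, so the following is only a sketch of one way an install script might substitute the values. The function name `render_app_config`, the template, and the sample values are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Exchangis): reads YAML on stdin and
# rewrites the `port:` and `defaultZone:` values, keeping indentation.
# Assumes the values contain no `|`, `&`, or `\`, which are special to sed here.
render_app_config() {
  local port="$1" eureka_url="$2"
  sed -e "s|^\( *port:\).*|\1 ${port}|" \
      -e "s|^\( *defaultZone:\).*|\1 ${eureka_url}|"
}

# Placeholder values, assumed for illustration only.
EXCHANGIS_PORT=9321
EUREKA_URL=http://127.0.0.1:20303/eureka/

# Minimal stand-in for the YAML file, with dummy values to be replaced.
template='server:
  port: 0
eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost/eureka/'

rendered="$(printf '%s\n' "$template" | render_app_config "$EXCHANGIS_PORT" "$EUREKA_URL")"
printf '%s\n' "$rendered"
```

Line-based substitution keeps the sketch dependency-free, at the cost of matching any `port:` key in the file regardless of its parent; a YAML-aware tool would be safer for a larger configuration.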
9 changes: 9 additions & 0 deletions assembly-package/config/db.sh
@@ -0,0 +1,9 @@
# Set the database connection information
# including IP address, database name, username, and port
MYSQL_HOST=
MYSQL_PORT=
MYSQL_USERNAME=
MYSQL_PASSWORD=
DATABASE=
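
As a rough illustration of how these variables relate to the `wds.linkis.server.mybatis.datasource.url` property in `exchangis-server.properties`, the sketch below assembles a JDBC URL in the same shape as the one in this commit. `build_jdbc_url` is a hypothetical helper, and the sample values are placeholders taken from that property's default, not settings the commit prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Exchangis): builds a MySQL JDBC URL with
# the same query parameters as the default in exchangis-server.properties.
build_jdbc_url() {
  local host="$1" port="$2" database="$3"
  printf 'jdbc:mysql://%s:%s/%s?useSSL=false&characterEncoding=UTF-8&allowMultiQueries=true\n' \
    "$host" "$port" "$database"
}

# Placeholder values, assumed for illustration only.
MYSQL_HOST=127.0.0.1
MYSQL_PORT=3306
DATABASE=exchangis

jdbc_url="$(build_jdbc_url "$MYSQL_HOST" "$MYSQL_PORT" "$DATABASE")"
printf '%s\n' "$jdbc_url"
```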


39 changes: 28 additions & 11 deletions assembly-package/config/exchangis-server.properties
@@ -15,38 +15,55 @@
#
#

wds.linkis.server.mybatis.datasource.url=jdbc:mysql://localhost:3306/database?useSSL=false&characterEncoding=UTF-8&allowMultiQueries=true
#wds.linkis.test.mode=true
wds.linkis.test.mode=false

wds.linkis.server.mybatis.datasource.url=jdbc:mysql://127.0.0.1:3306/exchangis?useSSL=false&characterEncoding=UTF-8&allowMultiQueries=true

wds.linkis.server.mybatis.datasource.username=username

wds.linkis.server.mybatis.datasource.password=password

wds.linkis.gateway.ip=127.0.0.1
wds.linkis.gateway.port=9001
wds.linkis.gateway.url=http://127.0.0.1:9001/

wds.linkis.log.clear=true

wds.linkis.server.version=v1

# datasource client
wds.exchangis.datasource.client.serverurl=
wds.exchangis.datasource.client.authtoken.key=DATASOURCE-AUTH
wds.exchangis.datasource.client.authtoken.value=DATASOURCE-AUTH
## datasource client
wds.exchangis.datasource.client.serverurl=http://127.0.0.1:9001
wds.exchangis.datasource.client.authtoken.key=EXCHANGIS-AUTH
wds.exchangis.datasource.client.authtoken.value=EXCHANGIS-AUTH
wds.exchangis.datasource.client.dws.version=v1

# launcher client
wds.exchangis.client.linkis.server-url=
wds.exchangis.client.linkis.token.value=DATASOURCE-AUTH
wds.exchangis.client.linkis.server-url=http://127.0.0.1:9001
wds.exchangis.client.linkis.token.value=EXCHANGIS-AUTH

wds.exchangis.datasource.extension.dir=exchangis-extds

##restful
wds.linkis.server.restful.scan.packages=com.webank.wedatasphere.exchangis.datasource.server.restful.api,\
com.webank.wedatasphere.exchangis.project.server.restful,\
com.webank.wedatasphere.exchangis.job.server.restful
wds.linkis.server.mybatis.mapperLocations=classpath*:com/webank/wedatasphere/exchangis/job/server/mapper/impl/*.xml:\
,classpath*:com/webank/wedatasphere/exchangis/project/server/mapper/impl/*.xml
wds.linkis.server.mybatis.mapperLocations=classpath*:com/webank/wedatasphere/dss/framework/appconn/dao/impl/*.xml,classpath*:com/webank/wedatasphere/dss/workflow/dao/impl/*.xml,\
classpath*:com/webank/wedatasphere/exchangis/job/server/mapper/impl/*.xml,\
classpath*:com/webank/wedatasphere/exchangis/project/server/mapper/impl/*.xml

wds.linkis.server.mybatis.BasePackage=com.webank.wedatasphere.exchangis.dao,\
com.webank.wedatasphere.exchangis.project.server.mapper,\
com.webank.wedatasphere.exchangis.job.server.mapper

com.webank.wedatasphere.linkis.configuration.dao,\
com.webank.wedatasphere.dss.framework.appconn.dao,\
com.webank.wedatasphere.dss.workflow.dao,\
com.webank.wedatasphere.linkis.metadata.dao,\
com.webank.wedatasphere.exchangis.job.server.mapper,\
com.webank.wedatasphere.exchangis.job.server.dao

wds.exchangis.job.task.scheduler.load-balancer.flexible.segments.min-occupy=0.25
wds.exchangis.job.task.scheduler.load-balancer.flexible.segments.max-occupy=0.5
#wds.exchangis.job.scheduler.group.max.running-jobs=4

wds.linkis.session.ticket.key=bdp-user-ticket-id
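
When checking a deployment, it can help to read a single setting back out of this file. The following is a minimal sketch that assumes simple `key=value` lines; `get_prop` is a hypothetical helper, and it deliberately ignores the backslash continuations used by the multi-line entries above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Exchangis): reads .properties text on stdin
# and prints the value of the first uncommented line whose key equals $1.
# Backslash line continuations are not handled.
get_prop() {
  awk -v k="$1" '
    /^[ \t]*#/ { next }                      # skip comment lines
    { i = index($0, "=") }                   # position of the first "="
    i > 0 && substr($0, 1, i - 1) == k { print substr($0, i + 1); exit }
  '
}

# Sample lines copied from the properties shown in this diff.
sample='#wds.linkis.test.mode=true
wds.linkis.test.mode=false
wds.linkis.gateway.url=http://127.0.0.1:9001/'

test_mode="$(printf '%s\n' "$sample" | get_prop wds.linkis.test.mode)"
gateway_url="$(printf '%s\n' "$sample" | get_prop wds.linkis.gateway.url)"
printf '%s\n%s\n' "$test_mode" "$gateway_url"
```

Splitting on the first `=` only means values that themselves contain `=`, such as the JDBC URL's query string, are returned intact.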

2 changes: 1 addition & 1 deletion assembly-package/pom.xml
@@ -21,7 +21,7 @@
<parent>
<artifactId>exchangis</artifactId>
<groupId>com.webank.wedatasphere.exchangis</groupId>
<version>1.0.0-RC1</version>
<version>1.0.0</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>assembly-package</artifactId>