You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Reduce the scope and focus of readepi to return data from web API data sources
Use ideas from DBI to create an interface specification and test suite for accessing web APIs
Implement the interface speficiation for Redcap, DHIS, Fingertips, GoData and ColOpenData
More...
Ideas from DBI
Define an interface specification for uniform and consistent data/metadata retrieval from web APIs as a front-end R package
Implement a back-end package that meets the interface specification for each web API (e.g. redcap) that is a data source
Define a test suite to guarantee back-end packages have implemented the interface as expected
In addition the following are nice to have in R
Return data/metadata as dataframes
Subset data at source if possible
The Interface
A simple interface for getting the data. The general consensus is around the following pseudo-code
datasource<- readepi(
connect, # required to establish connectionfilter, # rowsselect# columns
)
# Up to this point, no API request has been made. The previous steps of the pipeline serve to create the request we will make nowdata<- fetch_data(datasource)
# OR data<- connect_api(credentials) |>
select() |>
filter() |># Up to this point, no API request has been made. The previous steps of the pipeline serve to create the request we will make now
fetch_data()
Discussion
A case was made for removing functionality related to importing data from files because
readepi was only providing a thin wrapper around {rio}.
The same case can be made for removing functionality related to importing data from
relational databases because readepi is a wrapper around DBI.
After a quick review of DBI, my initial proposal was to implement readepi as a backend
to DBI. Upon investigation only 7 of the 45 functions defined in the DBI
spec are relevant to the concept of readepi. DBI is a spec tailored mainly to accessing
data in SQL based relational databases. Its interface abstraction leaks leaks some of this detail.
The semantics of web APIs are not the same as SQL semantics. Ideally any interface should be
implemented in the context of the R ecosystem and made to feel and behave as R objects
are expected to behave. This is where I believe readepi/readpapi should be positioned.
Hey @bahadzie - thanks for this extensive thought process! I am still getting familiar with everything, so please do tell me if I get things wrong. As a newbie here, I thought readepi's goal was to harmonize reading in data from various sources. That way, I would only need to know how to use readepi regardless of whether I'm reading in data from an API or a relational database.
Did you get feedback from people that they were confused by readepi's goal or preferred a different definition? Or is this from your (+your team's) assessment of doing the work creating the package?
On the topic of scope reduction: My experience with APIs is that it's very hard to know what you'll be getting back, if there is no known standard being used. There might be only a handful of sources or standards in epidemiology (I'm unaware of the landscape), which you are better in assessing. So my question for a redefinition to focus on APIs would be: Are the APIs we would want to read known or unknown in advance?
A generic readapi would of course still be interesting, but does that then still fit the intended purpose with which readepi was started?
readepi provides functions for importing epidemiological data into R from common health information systems.
README.Rmd - Thibaut Jombart commit 17 months ago.
tldr
More...
Ideas from DBI
front-end
R packageback-end
package that meets the interface specification for each web API (e.g. redcap) that is a data sourceIn addition the following are nice to have in R
The Interface
A simple interface for getting the data. The general consensus is around the following pseudo-code
Discussion
A case was made for removing functionality related to importing data from files because
readepi was only providing a thin wrapper around {rio}.
The same case can be made for removing functionality related to importing data from
relational databases because readepi is a wrapper around DBI.
After a quick review of DBI, my initial proposal was to implement readepi as a backend
to DBI. Upon investigation only 7 of the 45 functions defined in the DBI
spec are relevant to the concept of readepi. DBI is a spec tailored mainly to accessing
data in SQL based relational databases. Its interface abstraction leaks leaks some of this detail.
The semantics of web APIs are not the same as SQL semantics. Ideally any interface should be
implemented in the context of the R ecosystem and made to feel and behave as R objects
are expected to behave. This is where I believe readepi/readpapi should be positioned.
Appendix
A. readepi interface over time
NAMESPACE
read_from_file, read_from_ms_sql_server, read_from_redcap, read_stuff, readepi
NAMESPACE
readCredentials, read_from_file, read_from_ms_sql_server, read_from_redcap,
read_stuff, readepi, show_example_file
NAMESPACE
readCredentials, read_from_file, read_from_ms_sql_server, read_from_redcap,
read_stuff, readepi, showTables, show_example_file, subsetFields, subsetRecords
NAMESPACE
clade_assignment, genome_assembly, readCredentials, read_from_dhis2,
read_from_file, read_from_ms_sql_server, read_from_redcap, read_stuff,
readepi, showTables, show_example_file, subsetFields, subsetRecords
NAMESPACE
clade_assignment, genome_assembly, readCredentials, read_from_file,
read_from_ms_sql_server, read_from_redcap, read_stuff, readepi, showTables,
show_example_file, subsetFields, subsetRecords
NAMESPACE
clade_assignment, genome_assembly, readCredentials, read_from_dhis2,
read_from_file, read_from_ms_sql_server, read_from_redcap, read_stuff,
readepi, showTables, show_example_file, subsetFields, subsetRecords
NAMESPACE
getExtension, read_credentials, read_from_dhis2, read_from_file,
read_from_ms_sql_server, read_from_redcap, readepi, show_example_file,
show_tables, subset_fields, subset_records
NAMESPACE
getExtension, install_odbc_driver, read_credentials, read_from_dhis2,
read_from_file, read_from_ms_sql_server, read_from_redcap, readepi,
show_example_file, show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, getExtension, install_odbc_driver, read_credentials,
read_from_dhis2, read_from_file, read_from_ms_sql_server, read_from_redcap,
readepi, show_example_file, show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, getExtension, get_data_element_groups,
get_data_elements, get_data_sets, get_organisation_units, install_odbc_driver,
read_credentials, read_from_dhis2, read_from_file, read_from_ms_sql_server,
read_from_redcap, readepi, show_example_file, show_tables, subset_fields,
subset_records
NAMESPACE
check_dhis2_attributes, getExtension, get_data_elements, get_data_sets,
get_fingertips_metadata, get_indicatorID_from_domainID,
get_indicatorID_from_domainName, get_indicatorID_from_indicatorName,
get_indicatorID_from_profile, get_organisation_units, install_odbc_driver,
read_credentials, read_from_dhis2, read_from_file, read_from_fingertips,
read_from_ms_sql_server, read_from_redcap, readepi, show_example_file,
show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, getExtension, get_data_elements, get_data_sets, get_fingertips_metadata, get_indicatorID_from_domainID, get_indicatorID_from_domainName, get_indicatorID_from_indicatorName, get_indicatorID_from_profile, get_organisation_units, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_ms_sql_server, read_from_redcap, readepi, show_example_file, show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, getExtension, get_data_elements, get_data_sets, get_fingertips_metadata, get_indicatorID_from_domainID, get_indicatorID_from_domainName, get_indicatorID_from_indicatorName, get_indicatorID_from_profile, get_organisation_units, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_ms_sql_server, read_from_redcap, read_multiple_files, readepi, show_example_file, show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, get_data_elements, get_data_sets, get_extension, get_fingertips_metadata, get_ind_id_from_domain_id, get_ind_id_from_domain_name, get_ind_id_from_ind_name, get_ind_id_from_profile, get_organisation_units, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_ms_sql_server, read_from_redcap, read_multiple_files, readepi, show_example_file, show_tables, subset_fields, subset_records
NAMESPACE
check_dhis2_attributes, connect_to_server, detect_separator, fetch_data_from_query, get_data_elements, get_data_sets, get_extension, get_fingertips_metadata, get_ind_id_from_domain_id, get_ind_id_from_domain_name, get_ind_id_from_ind_name, get_ind_id_from_profile, get_organisation_units, identify_table_name, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_redcap, read_multiple_files, readepi, show_example_file, show_tables, sql_select_data, sql_select_entire_dataset, sql_select_fields_only, sql_select_records_and_fields, sql_select_records_only, sql_server_read_data, visualise_table
NAMESPACE
check_dhis2_attributes, connect_to_server, detect_separator, dhis2_subset_fields, dhis2_subset_records, fetch_data_from_query, get_base_name, get_data_elements, get_data_sets, get_extension, get_fingertips_metadata, get_ind_id_from_domain_id, get_ind_id_from_domain_name, get_ind_id_from_ind_name, get_ind_id_from_profile, get_organisation_units, get_read_file_args, get_read_fingertips_args, get_relevant_data_elt_group, get_relevant_dataset, get_relevant_organisation_unit, identify_table_name, import_redcap_data, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_redcap, read_multiple_files, readepi, redcap_get_results, redcap_read, redcap_read_fields, redcap_read_records, redcap_read_rows_columns, show_example_file, show_tables, sql_select_data, sql_select_entire_dataset, sql_select_fields_only, sql_select_records_and_fields, sql_select_records_only, sql_server_read_data, visualise_table
NAMESPACE
check_dhis2_attributes, connect_to_server, detect_separator, dhis2_subset_fields, dhis2_subset_records, fetch_data_from_query, fingertips_subset_columns, fingertips_subset_rows, get_base_name, get_data_elements, get_data_sets, get_extension, get_fingertips_metadata, get_ind_id_from_domain_id, get_ind_id_from_domain_name, get_ind_id_from_ind_name, get_ind_id_from_profile, get_organisation_units, get_profile_name, get_read_file_args, get_read_fingertips_args, get_relevant_data_elt_group, get_relevant_dataset, get_relevant_organisation_unit, identify_table_name, import_redcap_data, read_credentials, read_from_dhis2, read_from_file, read_from_fingertips, read_from_redcap, read_multiple_files, readepi, redcap_get_results, redcap_read, redcap_read_fields, redcap_read_records, redcap_read_rows_columns, show_example_file, show_tables, sql_select_data, sql_select_entire_dataset, sql_select_fields_only, sql_select_records_and_fields, sql_select_records_only, sql_server_read_data, visualise_table
NAMESPACE
get_dhis2_attributes, readepi, show_example_file, show_tables,
B. DBI Reference Functions
*functions that are relevant to {readepi}
Connecting and disconnecting
*dbConnect(drv, ...) - Create a connection to a DBMS
*dbGetInfo(dbObj, ...) - Get DBMS metadata
dbDisconnect(conn, ...) - Disconnect (close) a connection
dbCanConnect(drv, ...) - Check if a connection to a DBMS can be established
dbIsValid(dbObj, ...) - Is this DBMS object still valid?
dbIsReadOnly(dbObj, ...) - Is this DBMS object read only?
dbGetConnectArgs(drv, eval = TRUE, ...) - Get connection arguments
Tables
*dbReadTable(conn, name, ...) - Read database tables as data frames
*dbListTables(conn, ...) - List remote tables
*dbListFields(conn, name, ...) - List field names of a remote table
*dbExistsTable(conn, name, ...) - Does a table exist?
*dbListObjects(conn, prefix = NULL, ...) - List remote objects
dbWriteTable(conn, name, value, ...) - Copy data frames to database tables
dbCreateTable(conn, name, fields, ..., row.names = NULL, temporary = FALSE) - Create a table in the database
dbAppendTable(conn, name, value, ..., row.names = NULL) - Insert rows into a table
dbRemoveTable(conn, name, ...) Remove a table from the database
sqlRownamesToColumn(df, row.names = NA) / sqlColumnToRownames(df, row.names = NA) - Convert row names back and forth between columns
Queries and statements
dbGetQuery(conn, statement, ...) - Retrieve results from a query
dbExecute(conn, statement, ...) - Change database state
Results
dbSendQuery(conn, statement, ...) - Execute a query on a given database connection
dbSendStatement(conn, statement, ...) - Execute a data manipulation statement on a given database connection
dbBind(res, params, ...) / dbBindArrow(res, params, ...) - Bind values to a parameterized/prepared statement
dbFetch(res, n = -1, ...) / fetch(res, n = -1, ...) - Fetch records from a previously executed query
dbGetRowCount(res, ...) - The number of rows fetched so far
dbGetRowsAffected(res, ...) - The number of rows affected
dbGetStatement(res, ...) - Get the statement associated with a result set
dbHasCompleted(res, ...) - Completion status
dbColumnInfo(res, ...) - Information about result types
dbClearResult(res, ...) -Clear a result set
Transactions
dbBegin(conn, ...) / dbCommit(conn, ...) / dbRollback(conn, ...)- Begin/commit/rollback SQL transactions
dbWithTransaction(conn, code, ...) / dbBreak() - Self-contained SQL transactions
SQL
SQL(x, ..., names = NULL) - SQL quoting
dbDataType(dbObj, obj, ...) - Determine the SQL data type of an object
dbQuoteIdentifier(conn, x, ...) - Quote identifiers
dbQuoteLiteral(conn, x, ...) - Quote literal values
dbQuoteString(conn, x, ...) - Quote literal strings
sqlCreateTable(con, table, fields, row.names = NA, temporary = FALSE, ...) - Compose query to create a simple table
sqlAppendTable(con, table, values, row.names = NA, ...) - Compose query to insert rows into a table
sqlData(con, value, row.names = NA, ...) - Convert a data frame into form suitable for upload to an SQL database
dbUnquoteIdentifier(conn, x, ...) - Unquote identifiers
sqlInterpolate(conn, sql, ..., .dots = list()) - Safely interpolate values into an SQL string
Classes
DBIObject-class
DBIDriver-class
DBIConnection-class
DBIResult-class
DBIConnector-class
The text was updated successfully, but these errors were encountered: