Remove df property and change to_pandas_dataframe to to_pandas #146

tamargrey · 2020-09-24T22:21:40Z

closes #116
closes #147
closes #148

removes the .df property
changes to_pandas_dataframe to to_pandas
makes dataframe a protected attribute by renaming it to ._dataframe
removes the replace_none parameter and no longer calls fillna in the DataTable's init

codecov · 2020-09-24T22:32:12Z

Codecov Report

Merging #146 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #146   +/-   ##
=======================================
  Coverage   99.82%   99.82%           
=======================================
  Files          20       20           
  Lines        1740     1742    +2     
=======================================
+ Hits         1737     1739    +2     
  Misses          3        3

Impacted Files	Coverage Δ
woodwork/data_table.py	`100.00% <100.00%> (+0.38%)`	⬆️
woodwork/tests/data_table/test_datatable.py	`100.00% <100.00%> (ø)`
woodwork/tests/testing_utils/data_table_utils.py	`100.00% <100.00%> (ø)`
woodwork/tests/conftest.py	`90.90% <0.00%> (-9.10%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 466a702...e82cbe5. Read the comment docs.

thehomebrewnerd · 2020-09-25T13:10:41Z

I'm still thinking about whether to_pandas should return a copy of the dataframe or a reference to the dataframe (same with the .dataframe attribute on the table).

Consider this scenario:

1. The user creates a DataTable that contains an integer column. We set this up with the Integer logical type and the Int64 dtype.
2. The user accesses the dataframe from this DataTable to use in other calculations, and along the way the integer column is modified in place, say by a division operation that converts the integers into floats.
3. Now, if the user runs dt.types on this DataTable, the LogicalType will still be Integer, but the underlying dataframe column will have a dtype of float64.

We are now out of sync - and the logical type information on the DataTable is not accurate. Should we allow this to happen? Same type of thing could happen if a string column got converted into integers/floats.
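The drift described above can be reproduced with plain pandas. This is only a rough sketch: the `logical_types` dict is a hypothetical stand-in for the typing information a DataTable keeps, not Woodwork's actual implementation, and the exact float dtype produced by the division may differ between pandas versions.

```python
import pandas as pd

# Underlying dataframe with a nullable integer column, as in the scenario.
df = pd.DataFrame({"n": pd.array([1, 2, 3], dtype="Int64")})

# Hypothetical typing metadata, recorded once when the table is created.
logical_types = {"n": "Integer"}

# Handing out a reference means the caller works on the very same object.
shared = df
shared["n"] = shared["n"] / 2  # division turns the integer column into floats

# The recorded logical type is unchanged, but the column is no longer integer.
still_integer = pd.api.types.is_integer_dtype(df["n"])
print(logical_types["n"], df["n"].dtype, still_integer)
```

Nothing in the recorded metadata is touched when the column is reassigned, which is why the two can silently disagree.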

gsheni

minor suggestions

woodwork/data_table.py

gsheni · 2020-09-25T17:30:13Z

woodwork/data_table.py

+    def to_pandas(self, copy=False):
+        """Retrieves the DataTable's underlying dataframe. 
+
+        Note: Do not modify the dataframe unless copy=True has been set to avoid unexpected behavior
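
For illustration, the copy semantics that the docstring note warns about might look like the sketch below. `TableSketch` is a hypothetical stand-in, not the DataTable class from this PR; only the `to_pandas(self, copy=False)` signature and the `_dataframe` name are taken from the diff and commit list, and the method body is an assumption.

```python
import pandas as pd


class TableSketch:
    """Hypothetical stand-in for illustration; not Woodwork's DataTable."""

    def __init__(self, dataframe):
        self._dataframe = dataframe

    def to_pandas(self, copy=False):
        # copy=False hands back the internal object itself;
        # copy=True returns an independent copy that is safe to modify.
        if copy:
            return self._dataframe.copy()
        return self._dataframe


table = TableSketch(pd.DataFrame({"n": [1, 2, 3]}))

# Without copy=True the caller receives the internal dataframe itself.
same_object = table.to_pandas() is table._dataframe

# With copy=True, edits stay local to the returned copy.
safe = table.to_pandas(copy=True)
safe["n"] = safe["n"] * 10
unchanged = table.to_pandas()["n"].tolist()
```

Defaulting to `copy=False` avoids duplicating large dataframes on every call, at the cost of relying on callers to heed the note.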


gsheni

lgtm

gsheni suggested changes Sep 25, 2020

gsheni reviewed Sep 25, 2020

Tamar Grey added 9 commits September 25, 2020 13:55

Remove df property

1b0c0ef

Change to to_pandas

5cd1eac

Fix after rebase

4488553

Add changelog

c644d44

Change dataframe field to _dataframe

f13a986

remove replace_none param

34bb4a5

Update changelog

9fbe329

Add copy param to to_pandas

55047bf

fix-linting

e82cbe5

tamargrey force-pushed the to_pandas branch from 6828bbc to e82cbe5 Compare September 25, 2020 18:00

gsheni self-requested a review September 25, 2020 18:38

gsheni approved these changes Sep 25, 2020

gsheni assigned tamargrey Sep 25, 2020

tamargrey merged commit afc94eb into main Sep 25, 2020

gsheni mentioned this pull request Sep 28, 2020

v0.0.2 #158

Merged

gsheni deleted the to_pandas branch October 16, 2020 20:11
