Detailed formatcheck.py documentation
Parameter(s):
- key - Item to be added or modified
- value - Unit of measurement to be associated with key
- dct - The Dictionary that this is being applied to
If key is already in the dictionary, add the value to the set associated with the key
Otherwise, associate the new key with a new set containing only value
[Ex 1.] key = "Gas", value = "mcf" -> { "Gas": {"mcf"} }
[Ex 2.] key = "Geothermal - Electrical Generation", value = "Kilowatt Hours"
-> { "Geothermal - Electrical Generation": {"Kilowatt Hours"} }
key = "Geothermal - Electrical Generation", value = "Thousands of Pounds"
-> { "Geothermal - Electrical Generation": {"Kilowatt Hours", "Thousands of Pounds"} }
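The behaviour described above can be sketched in a few lines. This is a minimal illustration rather than the actual source; it assumes the dictionary is modified in place and that nothing is returned.

```python
def add_item(key, value, dct):
    """Add value to the set stored under key, creating the set if needed."""
    if key in dct:
        # Key already known: extend its set of units.
        dct[key].add(value)
    else:
        # New key: start a set containing only this unit.
        dct[key] = {value}


units = {}
add_item("Gas", "mcf", units)
add_item("Geothermal - Electrical Generation", "Kilowatt Hours", units)
add_item("Geothermal - Electrical Generation", "Thousands of Pounds", units)
```

Because the values are sets, adding the same unit twice for one key leaves the dictionary unchanged.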
Parameter(s):
- cols - Columns from Pandas DataFrame
Checks cols for "Commodity" or "Product"
Returns "n/a" if "Commodity" and "Product" are both present or both missing
Otherwise it returns whichever is present
[Ex 1.] cols = ["Commodity"] -> returns "Commodity"
[Ex 2.] cols = ["Product"] -> returns "Product"
[Ex 3.] cols = ["Commodity", "Product"] -> returns "Commodity"
Parameter(s):
- name - Name of the Excel file
Field(s):
- lower - name in all lowercase letters
- prefixes = ["cy","fy","monthly","company","federal","native","production","revenue","disbribution"]
Returns a String based on the Excel file given
If any entries from prefixes are found in name, they will be added to the final String
[Ex] name = "federal_production_CY03-18" -> returns "cyfederalproduction_"
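One way to reproduce the example above is sketched below. The function name `get_prefix` is hypothetical, and the sketch assumes that prefixes are matched as substrings of the lowercased name, joined in list order, and followed by a single underscore, as the example output suggests.

```python
# Prefix list copied as it appears in this documentation.
PREFIXES = ["cy", "fy", "monthly", "company", "federal",
            "native", "production", "revenue", "disbribution"]


def get_prefix(name):
    """Build a config prefix from the prefixes found in an Excel file name."""
    lower = name.lower()
    # Keep every prefix that occurs in the lowercased name, in list order.
    found = [prefix for prefix in PREFIXES if prefix in lower]
    # Assumption: the trailing underscore is always appended.
    return "".join(found) + "_"


result = get_prefix("federal_production_CY03-18")
```

Note that the result follows the order of the prefixes list, not the order in which the words appear in the file name.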
Parameter(s):
- string - String to be split
Returns a List of two Strings, split at either the right-most opening parenthesis "(" or the left-most comma ","
[Ex 1] string = "Gas (mcf)" -> ["Gas", "mcf"]
[Ex 2] string = "Geothermal - Electrical Generation, Kilowatt Hours"
-> ["Geothermal - Electrical Generation", "Kilowatt Hours"]
[Ex 3] string = "Geothermal - sulfur" -> ["Geothermal - sulfur", ""]
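The splitting rule can be sketched as follows. This is an illustration under two assumptions not stated above: a parenthesis takes precedence over a comma when both are present, and surrounding whitespace and the closing ")" are stripped from the results.

```python
def split_unit(string):
    """Split an entry such as "Gas (mcf)" into [item, unit]."""
    if "(" in string:
        # Split at the right-most "(" and drop the closing ")".
        item, _, unit = string.rpartition("(")
        unit = unit.strip().rstrip(")")
    elif "," in string:
        # Split at the left-most ",".
        item, _, unit = string.partition(",")
    else:
        # No unit present: return an empty string for it.
        item, unit = string, ""
    return [item.strip(), unit.strip()]


gas = split_unit("Gas (mcf)")
geo = split_unit("Geothermal - Electrical Generation, Kilowatt Hours")
sulfur = split_unit("Geothermal - sulfur")
```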
Parameter(s):
- file - A Pandas DataFrame
Returns column names as a List
Returns a dictionary of items and their units. Calls split_unit and add_item
Product
Salt (tons)
Soda Ash (tons)
Sodium Bi-Carbonate (tons)
Gas (mcf)
Borate Products (tons)
Returns {"Salt" : "tons",
"Soda Ash" : "tons",
"Sodium Bi-Carbonate" : "tons",
"Gas" : "mcf",
"Borate Products" : "tons"}
Parameter(s):
- type - Prefix for config file represented by a String
Returns a dictionary based on the JSON file
Parameter(s):
- file - A Pandas DataFrame
Returns a tuple based on the number of "W"s found in Volume or "Withheld"s found in State
Calendar Year  Land Category  Land Class  State     ...  Product                     Volume
2003           Onshore        Federal     CA        ...  Salt (tons)                 33,622
2003           Onshore        Federal     CA        ...  Soda Ash (tons)             W
2003           Onshore        Federal     CA        ...  Sodium Bi-Carbonate (tons)  W
2003           Onshore        Federal     CA        ...  Gas (mcf)                   4,885.6
2003           Onshore        Federal     Withheld  ...  Borate Products (tons)      31,124
Returns (2,1)
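The count in the example can be sketched as below. The function name `count_withheld` is hypothetical, and rows are modelled as plain dictionaries instead of a Pandas DataFrame so that the sketch has no dependencies; only the State and Volume columns from the example are included.

```python
def count_withheld(rows):
    """Return (number of "W" in Volume, number of "Withheld" in State)."""
    volume_w = sum(1 for row in rows
                   if str(row.get("Volume", "")).strip() == "W")
    state_withheld = sum(1 for row in rows
                         if str(row.get("State", "")).strip() == "Withheld")
    return (volume_w, state_withheld)


rows = [
    {"State": "CA", "Product": "Salt (tons)", "Volume": "33,622"},
    {"State": "CA", "Product": "Soda Ash (tons)", "Volume": "W"},
    {"State": "CA", "Product": "Sodium Bi-Carbonate (tons)", "Volume": "W"},
    {"State": "CA", "Product": "Gas (mcf)", "Volume": "4,885.6"},
    {"State": "Withheld", "Product": "Borate Products (tons)", "Volume": "31,124"},
]
counts = count_withheld(rows)
```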
Parameter(s):
- file - A Pandas DataFrame
Iterates through default header and checks if specific Field Names are present.
Prints out if a Field Name is missing or in the wrong order
Unexpected Field Names are printed separately.
[Ex] default = ["Month", "Calendar Year", "Land Class", "Land Category", "Commodity", "Volume"]
columns = ["Moth", "Calendar Year", "Land Category", "Land Class", "Commodity", "Volume"]
-> "Month": Not Present, "Land Category": Unexpected Order, "Land Class": Unexpected Order
New Cols: Moth
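A possible reading of this check is sketched below. The function name `check_header` is hypothetical, and the sketch returns its findings instead of printing them so they can be inspected. It assumes a field is in "Unexpected Order" when it is present but not at the same position as in the default header.

```python
def check_header(default, columns):
    """Compare actual column names against the default header."""
    problems = {}
    for index, field in enumerate(default):
        if field not in columns:
            problems[field] = "Not Present"
        elif index >= len(columns) or columns[index] != field:
            # Present, but not at the expected position.
            problems[field] = "Unexpected Order"
    # Columns that the default header does not know about.
    new_cols = [col for col in columns if col not in default]
    return problems, new_cols


default = ["Month", "Calendar Year", "Land Class", "Land Category",
           "Commodity", "Volume"]
columns = ["Moth", "Calendar Year", "Land Category", "Land Class",
           "Commodity", "Volume"]
problems, new_cols = check_header(default, columns)
```

A misspelled column such as "Moth" therefore shows up twice: once as the missing "Month" and once as a new column.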
Parameter(s):
- file - A Pandas DataFrame
Iterates through non-numerical fields and checks for unexpected entries.
Also checks Calendar Year
Parameter(s):
- file - A Pandas DataFrame
Iterates through specific columns and prints each cell with missing information
Parameter(s):
- file - A Pandas DataFrame
Iterates through the column with expected units. Splits each entry into item and unit, then compares the result to the default unit dictionary to determine whether it is valid