Usage

Categorizations

Included categorizations

In the climate_categories package, the categorizations are available directly in the top-level namespace and as a dictionary in .cats:

[1]:
import climate_categories

climate_categories.cats
[1]:
{'IPCC1996': <Categorization IPCC1996 'IPCC GHG emission categories (1996)' with 233 categories>,
 'IPCC2006': <Categorization IPCC2006 'IPCC GHG emission categories (2006)' with 290 categories>,
 'IPCC2006_PRIMAP': <Categorization IPCC2006_PRIMAP 'IPCC GHG emission categories (2006) with custom categories used in PRIMAP' with 303 categories>,
 'CRF1999': <Categorization CRF1999 'Common Reporting Format GHG emissions categories (1999)' with 400 categories>,
 'CRF2013': <Categorization CRF2013 'Common Reporting Format GHG emissions categories (2013)' with 729 categories>,
 'CRF2013_2021': <Categorization CRF2013_2021 'Common Reporting Format GHG emissions categories (2013). Extended for 2021 CRF submissions.' with 960 categories>,
 'CRF2013_2022': <Categorization CRF2013_2022 'Common Reporting Format GHG emissions categories (2013). Extended for 2022 CRF submissions.' with 966 categories>,
 'CRF2013_2023': <Categorization CRF2013_2023 'Common Reporting Format GHG emissions categories (2013). Extended for 2023 CRF submissions.' with 967 categories>,
 'CRFDI': <Categorization CRFDI 'CRF GHG emission categories (DI query interface)' with 409 categories>,
 'CRFDI_class': <Categorization CRFDI_class 'CRF GHG emission categories (DI query interface) + classifications' with 1133 categories>,
 'BURDI': <Categorization BURDI 'BUR GHG emission categories (DI query interface)' with 237 categories>,
 'BURDI_class': <Categorization BURDI_class 'BUR GHG emission categories (DI query interface) + classifications' with 395 categories>,
 'GCB': <Categorization GCB 'Global Carbon Budget CO2 Emissions' with 9 categories>,
 'RCMIP': <Categorization RCMIP 'Emissions categories from the Reduced Complexity Model Intercomparison Project (RCMIP)' with 87 categories>,
 'gas': <Categorization gas 'climate-forcing gases' with 242 categories>,
 'ISO3': <Categorization ISO3 'ISO 3166-1 countries with climate-relevant groupings' with 264 categories>,
 'ISO3_GCAM': <Categorization ISO3_GCAM 'ISO 3166-1 countries with climate-relevant groupings with GCAM regions' with 456 categories>}
[2]:
climate_categories.IPCC2006
[2]:
<Categorization IPCC2006 'IPCC GHG emission categories (2006)' with 290 categories>
[3]:
climate_categories.cats["IPCC2006"]
[3]:
<Categorization IPCC2006 'IPCC GHG emission categories (2006)' with 290 categories>

Metadata for each categorization are accessible as properties:

[4]:
print(climate_categories.IPCC2006.name)
print(climate_categories.IPCC2006.title)
print(climate_categories.IPCC2006.comment)
print(climate_categories.IPCC2006.references)
print(climate_categories.IPCC2006.institution)
print(climate_categories.IPCC2006.last_update)
print(climate_categories.IPCC2006.version)
IPCC2006
IPCC GHG emission categories (2006)
IPCC classification of green-house gas emissions into categories, 2006 edition
IPCC 2006, 2006 IPCC Guidelines for National Greenhouse Gas Inventories, Prepared by the National Greenhouse Gas Inventories Programme, Eggleston H.S., Buendia L., Miwa K., Ngara T. and Tanabe K. (eds). Volume 1, Chapter 8, Table 8.2, https://www.ipcc-nggip.iges.or.jp/public/2006gl/vol1.html
IPCC
2010-06-30
2006

The categorization can be used as a dictionary mapping category codes to categories:

[5]:
climate_categories.IPCC2006["1.A"]
[5]:
<IPCC2006: '1.A'>

You can also query using alternative spellings of the code:

[6]:
climate_categories.IPCC2006["1A"]
[6]:
<IPCC2006: '1.A'>
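
One way to picture this lookup is a second index that maps alternative spellings to the canonical code. The following is a minimal sketch with plain dictionaries, not the package's implementation; `CATEGORIES`, `ALTERNATIVE_CODES` and `lookup` are hypothetical names introduced only for illustration.

```python
# Hypothetical stand-in data; not read from climate_categories itself.
CATEGORIES = {"1.A": "Fuel Combustion Activities", "1.A.1": "Energy Industries"}
ALTERNATIVE_CODES = {"1A": "1.A", "1A1": "1.A.1"}


def lookup(code: str) -> tuple[str, str]:
    """Return (canonical code, title) for a canonical or alternative code."""
    # Fall back to the code itself when it is not an alternative spelling.
    canonical = ALTERNATIVE_CODES.get(code, code)
    return canonical, CATEGORIES[canonical]


print(lookup("1A"))
print(lookup("1.A"))
```

Both calls resolve to the same canonical entry, which mirrors how "1A" and "1.A" return the same category above.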

Each category also carries metadata: a title, an optional comment, all of its codes, and possibly additional non-standard information in the info dictionary:

[7]:
one_a = climate_categories.IPCC2006["1.A"]
print(one_a.title)
print(one_a.comment)
print(one_a.codes)
print(one_a.info)
Fuel Combustion Activities
Emissions from the intentional oxidation of materials within an apparatus that is designed to raise heat and provide it either as heat or as mechanical work to a process or for use away from the apparatus.
('1.A', '1A')
{'gases': ['CO2', 'CH4', 'N2O', 'NOx', 'CO', 'NMVOC', 'SO2'], 'corresponding_categories_IPCC1996': ['1A']}

For hierarchical categorizations, you can also query for parent and child categories. Note that children returns a list of sets, because a category may be composed of its children in more than one way:

[8]:
climate_categories.IPCC2006["1.A"].children
[8]:
[{<IPCC2006: '1.A.1'>,
  <IPCC2006: '1.A.2'>,
  <IPCC2006: '1.A.3'>,
  <IPCC2006: '1.A.4'>,
  <IPCC2006: '1.A.5'>}]
[9]:
climate_categories.IPCC2006["1.A"].parents
[9]:
{<IPCC2006: '1'>}

Finally, you can check whether a categorization is hierarchical and, for hierarchical categorizations, whether the sum of all child categories should equal the parent category:

[10]:
print(f"Hierarchical: {climate_categories.IPCC2006.hierarchical}")
print(f"Total sum: {climate_categories.IPCC2006.total_sum}")
Hierarchical: True
Total sum: True
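
When total_sum is true, data reported for a parent and for all of its children can be checked for consistency. The sketch below illustrates that idea with plain Python; the emission values and the helper `children_match_parent` are hypothetical and not part of climate_categories.

```python
# Hypothetical emissions keyed by category code (arbitrary units).
emissions = {
    "1.A": 10.0,
    "1.A.1": 4.0,
    "1.A.2": 3.0,
    "1.A.3": 2.0,
    "1.A.4": 0.5,
    "1.A.5": 0.5,
}
# The single child set of 1.A shown above, written as plain codes.
children_of_1a = ["1.A.1", "1.A.2", "1.A.3", "1.A.4", "1.A.5"]


def children_match_parent(data, parent, children, tolerance=1e-9):
    """Check that the children's values add up to the parent's value."""
    child_total = sum(data[code] for code in children)
    return abs(child_total - data[parent]) <= tolerance


print(children_match_parent(emissions, "1.A", children_of_1a))
```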

Visualization

The relationships between categories in a hierarchical categorization can be visualized in a tree-like fashion:

[11]:
# Limit the maximum depth shown using `maxdepth`
print(climate_categories.IPCC2006.show_as_tree(maxdepth=2))
0 National Total
├1 Energy
├2 Industrial Processes and Product Use
├3 Agriculture, Forestry, and Other Land Use
├4 Waste
╰5 Other

[12]:
# Print only a part of tree using `root`
print(climate_categories.IPCC2006.show_as_tree(root="1A1"))
1.A.1 Energy Industries
├1.A.1.a Main Activity Electricity and Heat Production
│├1.A.1.a.i Electricity Generation
│├1.A.1.a.ii Combined Heat and Power Generation (CHP)
│╰1.A.1.a.iii Heat Plants
├1.A.1.b Petroleum Refining
╰1.A.1.c Manufacture of Solid Fuels and Other Energy Industries
 ├1.A.1.c.i Manufacture of Solid Fuels
 ╰1.A.1.c.ii Other Energy Industries

Child sets in hierarchical categorizations

In hierarchical categorizations, a category can have more than one set of children. For example, in emissions reporting, industrial emissions can be reported either by industry sector or by fuel. In this case, the parent industry category has two sets of children: either all the industry sectors, or all of the fuels.

You can see this with two toy example categorizations included in climate_categories:

[13]:
import climate_categories.tests.examples

HierEx = climate_categories.tests.examples.HierEx()
print(
    "Hierarchical categorization with only one way to subdivide the top category:"
)
print(HierEx.show_as_tree())

HierAltEx = climate_categories.tests.examples.HierAltEx()
print(
    "\nHierarchical categorization with two ways to subdivide the top category:"
)
print(HierAltEx.show_as_tree())
Hierarchical categorization with only one way to subdivide the top category:
0 Total
├1 Sector 1
├2 Sector 2
╰3 Sector 3


Hierarchical categorization with two ways to subdivide the top category:
0 Total
╠╤══ ('0 Total's children, option 1)
║├1 Sector 1
║├2 Sector 2
║╰3 Sector 3
╠╕ ('0 Total's children, option 2)
║├a Fuel a
║├b Fuel b
║╰c Fuel c
╚═══

As you can see, alternative ways to subdivide a category are indicated with double lines in show_as_tree. Programmatically, the difference is clear because children contains a list of possible child sets:

[14]:
HierEx["0"].children
[14]:
[{<HierEx: '1'>, <HierEx: '2'>, <HierEx: '3'>}]
[15]:
HierAltEx["0"].children
[15]:
[{<HierAltEx: '1'>, <HierAltEx: '2'>, <HierAltEx: '3'>},
 {<HierAltEx: 'a'>, <HierAltEx: 'b'>, <HierAltEx: 'c'>}]
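
Alternative child sets matter when aggregating data: the parent can be computed from whichever child set is fully reported. The following is a minimal sketch of that idea using plain sets of codes instead of category objects; the data values and the `aggregate` helper are hypothetical.

```python
# Child sets of the toy category "0", written as plain sets of codes.
child_sets = [{"1", "2", "3"}, {"a", "b", "c"}]

# Hypothetical data that is only reported by fuel.
reported = {"a": 1.5, "b": 2.0, "c": 0.5}


def aggregate(child_sets, data):
    """Sum the first child set for which every child has a value."""
    for child_set in child_sets:
        if child_set.issubset(data):
            return sum(data[code] for code in child_set)
    raise ValueError("no child set is fully covered by the data")


total = aggregate(child_sets, reported)
print(total)
```

Here the sector-based child set is skipped because none of its codes are present, and the fuel-based set is used instead.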

Finding leaf descendants

For purposes like re-calculating top-level categories from simple leaf categories, it is useful to find the descendants of a category which have no children. Use the leaf_children property to do so.

[16]:
HierEx["0"].leaf_children
[16]:
[{<HierEx: '1'>, <HierEx: '2'>, <HierEx: '3'>}]
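
The idea behind such a query can be sketched as a recursion over the child sets, where each alternative child set yields its own set of leaves. This is an illustrative sketch over a hypothetical plain-dictionary hierarchy, not the package's implementation, and its result for a category without children (a one-element set containing the category itself) is a choice made for the sketch only.

```python
from itertools import product

# Hypothetical hierarchy: code -> list of alternative child sets.
CHILDREN = {
    "0": [{"1", "2"}, {"a", "b"}],
    "1": [{"1.x", "1.y"}],
}


def leaf_sets(code):
    """Return every way to express `code` as a set of childless descendants."""
    alternatives = CHILDREN.get(code, [])
    if not alternatives:
        return [{code}]  # a leaf stands for itself in this sketch
    results = []
    for child_set in alternatives:
        # Each child may itself decompose in several ways; combine one
        # decomposition per child into a single set of leaves.
        per_child = [leaf_sets(child) for child in sorted(child_set)]
        for combination in product(*per_child):
            results.append(set().union(*combination))
    return results


print(leaf_sets("0"))
```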

Extending categorizations

Often, you want to use a common categorization but need to add a few categories of your own. You can do this with the extend method:

[17]:
IPCC2006_lulucf_extra = climate_categories.IPCC2006.extend(
    name="IPCC2006_lulucf_extra",
    categories={
        "M0.EL": {
            "title": "Total excluding lulucf",
            "comment": "All emissions and removals except emissions from land use, land use change, and forestry",
        }
    },
    children=[("M0.EL", ("1", "2", "4", "5"))],
)

print(IPCC2006_lulucf_extra.name)
print(IPCC2006_lulucf_extra.title)
print(IPCC2006_lulucf_extra.comment)
print(IPCC2006_lulucf_extra.show_as_tree(maxdepth=2))
IPCC2006_IPCC2006_lulucf_extra
IPCC GHG emission categories (2006) + IPCC2006_lulucf_extra
IPCC classification of green-house gas emissions into categories, 2006 edition extended by IPCC2006_lulucf_extra
0 National Total
├1 Energy
├2 Industrial Processes and Product Use
├3 Agriculture, Forestry, and Other Land Use
├4 Waste
╰5 Other

M0.EL Total excluding lulucf
├1 Energy
├2 Industrial Processes and Product Use
├4 Waste
╰5 Other

If the canonical top-level category of a hierarchical categorization is defined, you can also calculate the level of a category in the hierarchy:

[18]:
print(climate_categories.IPCC2006["0"].level)
print(climate_categories.IPCC2006["1.A"].level)
1
3
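
The numbers above are consistent with counting the top-level category as level 1 and adding one per step down the hierarchy. A minimal sketch of that counting, assuming a hypothetical `PARENT` mapping in which every category has exactly one parent (the package itself also has to handle categories with several parents):

```python
# Hypothetical single-parent mapping for a few IPCC2006 codes.
PARENT = {"1": "0", "1.A": "1", "1.A.1": "1.A"}


def level(code, top="0"):
    """Count the steps from `code` up to the top category, starting at 1."""
    depth = 1
    while code != top:
        code = PARENT[code]
        depth += 1
    return depth


print(level("0"))
print(level("1.A"))
```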

Pandas integration

For each categorization, the categories are also available as a pandas DataFrame:

[19]:
climate_categories.IPCC2006.df
[19]:
         title                                              comment                                             alternative_codes  children
0        National Total                                     All emissions and removals                          ()                 ((1, 2, 3, 4, 5),)
1        Energy                                             This category includes all GHG emissions arisi...   ()                 ((1.A, 1.B, 1.C),)
1.A      Fuel Combustion Activities                         Emissions from the intentional oxidation of ma...   (1A,)              ((1.A.1, 1.A.2, 1.A.3, 1.A.4, 1.A.5),)
1.A.1    Energy Industries                                  Comprises emissions from fuels combusted ...        (1A1,)             ((1.A.1.a, 1.A.1.b, 1.A.1.c),)
1.A.1.a  Main Activity Electricity and Heat Production      Sum of emissions from main activity producers ...   (1A1a,)            ((1.A.1.a.i, 1.A.1.a.ii, 1.A.1.a.iii),)
...      ...                                                ...                                                 ...                ...
4.D.2    Industrial Wastewater Treatment and Discharge      Treatment and discharge of liquid wastes and s...   (4D2,)             ()
4.E      Other (Please Specify)                             Release of GHGs from other waste handling acti...   (4E,)              ()
5        Other                                              None                                                ()                 ((5.A, 5.B),)
5.A      Indirect N2O Emissions from The Atmospheric De...  Excluding indirect emissions from NOx and NH3 ...   (5A,)              ()
5.B      Other (Please Specify)                             Only use this category exceptionally, for any ...   (5B,)              ()

290 rows × 4 columns

Finding unknown codes

Searching for a code in all included categorizations is possible using the find_code function:

[20]:
climate_categories.find_code("1A")
[20]:
{<CRF2013_2021: '1.A'>,
 <IPCC1996: '1.A'>,
 <CRF1999: '1.A'>,
 <BURDI_class: '1.A'>,
 <CRF2013_2023: '1.A'>,
 <CRF2013: '1.A'>,
 <IPCC2006_PRIMAP: '1.A'>,
 <IPCC2006: '1.A'>,
 <CRF2013_2022: '1.A'>,
 <BURDI: '1.A'>,
 <CRFDI: '1.AA'>,
 <CRFDI_class: '1.AA'>}

Conversions

Included conversions

You can get the rules for conversion between categories using the conversion objects:

[21]:
conv = climate_categories.IPCC1996.conversion_to("IPCC2006")
conv
[21]:
<Conversion 'IPCC1996' <-> 'IPCC2006' with 153 rules>

The conversion object can be queried for metadata:

[22]:
print(conv.categorization_a)
print(conv.categorization_b)
print(conv.auxiliary_categorizations)
print(conv.comment)
print(conv.institution)
print(conv.references)
print(conv.version)
print(conv.last_update)
IPCC1996
IPCC2006
[<Categorization gas 'climate-forcing gases' with 242 categories>]
None
IPCC
IPCC 2006, 2006 IPCC Guidelines for National Greenhouse Gas Inventories, Prepared by the National Greenhouse Gas Inventories Programme, Eggleston H.S., Buendia L., Miwa K., Ngara T. and Tanabe K. (eds). Volume 1, Chapter 8d, Table 8.2  https://www.ipcc-nggip.iges.or.jp/public/2006gl/vol1.html
None
2021-07-09 00:00:00

More importantly, the conversion object also holds the actual conversion rules:

[23]:
len(conv.rules)
[23]:
153
[24]:
conv.rules[0]
[24]:
ConversionRule(factors_categories_a={<IPCC1996: '0'>: 1}, factors_categories_b={<IPCC2006: '0'>: 1}, auxiliary_categories={<Categorization gas 'climate-forcing gases' with 242 categories>: set()}, comment='', csv_line_number=5, csv_original_text='0,,0,', cardinality_a='one', cardinality_b='one', is_restricted=False)

For the rules, the most important parts are the categories and factors for each side and the specification for which auxiliary categories the rule is valid:

[25]:
rule = conv.rules[0]
print(rule.factors_categories_a)
print(rule.factors_categories_b)
print(rule.auxiliary_categories)
{<IPCC1996: '0'>: 1}
{<IPCC2006: '0'>: 1}
{<Categorization gas 'climate-forcing gases' with 242 categories>: set()}

In this example, category 0 of the IPCC1996 categorization equals exactly category 0 of the IPCC2006 categorization (i.e. the factor is 1 on both sides), and the rule is valid for all gases (the set of gases is empty, which means the rule is unrestricted).

Often, you might be interested in rules concerning a specific set of categories, which you can also fetch:

[26]:
one_or_two_rules = conv.relevant_rules(
    {climate_categories.IPCC1996["1"], climate_categories.IPCC1996["2"]}
)

for rule in one_or_two_rules:
    print("###")
    print(rule.format_human_readable())
###
IPCC1996 1 Energy
⮁
IPCC2006 1 Energy

###
IPCC1996 2 Industrial Processes
IPCC1996 3 Solvent and Other Product Use
⮁
IPCC2006 2 Industrial Processes and Product Use

Here, we used the format_human_readable() function to get a nicely formatted output. You can see that for this conversion, the category 1 maps cleanly between the two categorizations, but for category 2, you need to combine categories 2 and 3 from IPCC1996 to get category 2 from IPCC2006.
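
To make the role of the factors concrete, the sketch below applies the second rule to some numbers. It uses plain dictionaries of codes and factors in place of the rule's factors_categories_a and factors_categories_b, and the emission values and the `apply_rule` helper are hypothetical; it is not a function provided by climate_categories.

```python
# Rule from the text: IPCC1996 "2" + IPCC1996 "3" <-> IPCC2006 "2".
factors_a = {"2": 1, "3": 1}
factors_b = {"2": 1}

# Hypothetical emissions in IPCC1996 categories (arbitrary units).
data_1996 = {"1": 100.0, "2": 20.0, "3": 5.0}


def apply_rule(factors_a, factors_b, data):
    """Compute the single side-b category from the weighted side-a values."""
    if len(factors_b) != 1:
        raise ValueError("this sketch only handles one category on side b")
    ((target, target_factor),) = factors_b.items()
    total = sum(factor * data[code] for code, factor in factors_a.items())
    return target, total / target_factor


print(apply_rule(factors_a, factors_b, data_1996))
```

Rules with several categories on the target side cannot be resolved this way, because the combined value alone does not say how to split it.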

Finding inconsistencies

Dealing with large conversion rule sets can be daunting, so there are a few functions to help when checking conversions:

[27]:
# describe all of the conversion rules
# We only print the start of the description because it is looong
print(conv.describe_detailed()[:200] + "…")
# Mapping between IPCC1996 and IPCC2006

## Simple direct mappings

IPCC1996 0 National Total
IPCC2006 0 National Total

IPCC1996 1 Energy
IPCC2006 1 Energy

IPCC1996 1.A Fuel Combustion Activities
IP…
[28]:
# Find potential over counting problems
suspected_problems = conv.find_over_counting_problems()
for p in suspected_problems:
    print(p)
    print()
<IPCC1996: '6.A.1'> is possibly counted multiple times
involved leave groups categories: [[<IPCC2006: '4.A.1'>], [<IPCC2006: '4.A.3'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '6,,4,' from line 146>, <Rule '6.A,,4.A,' from line 147>, <Rule '6.A.1,,4.A.1,' from line 148>, <Rule '6.A.1 + 6.A.2,,4.A.3,if possible prefer the individual rules above' from line 150>.

<IPCC1996: '6.A.2'> is possibly counted multiple times
involved leave groups categories: [[<IPCC2006: '4.A.2'>], [<IPCC2006: '4.A.3'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '6,,4,' from line 146>, <Rule '6.A,,4.A,' from line 147>, <Rule '6.A.2,,4.A.2,' from line 149>, <Rule '6.A.1 + 6.A.2,,4.A.3,if possible prefer the individual rules above' from line 150>.

<IPCC1996: '6.A.3'> is possibly counted multiple times
involved leave groups categories: [[<IPCC2006: '4.A'>], [<IPCC2006: '4.B'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '6,,4,' from line 146>, <Rule '6.A,,4.A,' from line 147>, <Rule '6.A.3,,4.B,' from line 151>.

<IPCC1996: '6.B.3'> is possibly counted multiple times
involved leave groups categories: [[<IPCC2006: '4.D'>], [<IPCC2006: '4.E'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '6,,4,' from line 146>, <Rule '6.B,,4.D,' from line 152>, <Rule '6.D + 6.B.3,,4.E,' from line 156>.

<IPCC2006: '2.B.7'> is possibly counted multiple times
involved leave groups categories: [[<IPCC1996: '2.A.3'>, <IPCC1996: '2.A.4'>], [<IPCC1996: '2.B'>, <IPCC1996: '3.C'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '2 + 3,,2,' from line 81>, <Rule '2.A.3 + 2.A.4,,2.A.3 + 2.A.4 + 2.B.7,' from line 85>, <Rule '2.B + 3.C,,2.B,' from line 88>.

<IPCC2006: '2.B.9'> is possibly counted multiple times
involved leave groups categories: [[<IPCC1996: '2.B'>, <IPCC1996: '3.C'>], [<IPCC1996: '2.E'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '2 + 3,,2,' from line 81>, <Rule '2.B + 3.C,,2.B,' from line 88>, <Rule '2.E,,2.B.9,' from line 103>.

<IPCC2006: '2.B.9.a'> is possibly counted multiple times
involved leave groups categories: [[<IPCC1996: '2.B'>, <IPCC1996: '3.C'>], [<IPCC1996: '2.E.1'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '2 + 3,,2,' from line 81>, <Rule '2.B + 3.C,,2.B,' from line 88>, <Rule '2.E,,2.B.9,' from line 103>, <Rule '2.E.1,,2.B.9.a,' from line 104>.

<IPCC2006: '2.B.9.b'> is possibly counted multiple times
involved leave groups categories: [[<IPCC1996: '2.B'>, <IPCC1996: '3.C'>], [<IPCC1996: '2.E.2'>]]
involved rules: <Rule '0,,0,' from line 5>, <Rule '2 + 3,,2,' from line 81>, <Rule '2.B + 3.C,,2.B,' from line 88>, <Rule '2.E,,2.B.9,' from line 103>, <Rule '2.E.2,,2.B.9.b,' from line 105>.

Note that find_over_counting_problems currently cannot reliably detect all over-counting problems, and some suspected problems may turn out to be fine on closer examination. Use this function only to generate hints for possible problems.

[29]:
# Find unmapped categories
missing_1996, missing_2006 = conv.find_unmapped_categories()
# returns sets of missing categories
# unfortunately, this conversion is not very complete:
print(len(missing_1996))
print(len(missing_2006))
71
110
[30]:
# the sets just contain regular categories which were forgotten
next(iter(missing_1996))
[30]:
<IPCC1996: '5.B.2'>
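
Conceptually, a category is unmapped when it appears in none of the rules. A minimal sketch of that set difference, using hypothetical codes and rules rather than the real conversion (the package may apply additional criteria):

```python
# Hypothetical categorization with four categories on side a.
all_categories_a = {"1", "2", "3", "4"}

# Hypothetical rules as (side a codes, side b codes) pairs.
rules = [({"1"}, {"1"}), ({"2", "3"}, {"2"})]

# Every side-a category that is mentioned in at least one rule.
mapped_a = set().union(*(side_a for side_a, _ in rules))
unmapped_a = all_categories_a - mapped_a

print(unmapped_a)
```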

Viewing the full conversion

To view the full conversion, you can also load the source CSVs from the climate_categories/data/ folder directly in a spreadsheet program.
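
If you prefer to inspect rule lines in Python, the standard csv module is enough for a first look. The sketch below parses two rule texts copied from the csv_original_text values shown earlier. The column order (IPCC1996 side, auxiliary gas restriction, IPCC2006 side, comment) is an assumption inferred from those texts, `parse_rule_line` is a hypothetical helper, and the sketch handles neither the header lines of the real files nor anything beyond simple sums.

```python
import csv
import io

# Two rule lines as they appear in csv_original_text above.
RULE_LINES = (
    "0,,0,\n"
    "6.A.1 + 6.A.2,,4.A.3,if possible prefer the individual rules above\n"
)


def parse_rule_line(fields):
    """Split one four-field rule line into its parts (sums only)."""
    side_a, aux, side_b, comment = fields
    return {
        "a": [code.strip() for code in side_a.split("+")],
        "aux": aux,  # assumed: auxiliary (gas) restriction, empty if none
        "b": [code.strip() for code in side_b.split("+")],
        "comment": comment,
    }


parsed = [parse_rule_line(fields) for fields in csv.reader(io.StringIO(RULE_LINES))]
for rule in parsed:
    print(rule["a"], "<->", rule["b"], rule["comment"])
```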