{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true, "pycharm": { "name": "#%% md\n" } }, "source": [ "# Usage\n", "\n", "## Categorizations\n", "\n", "### Included categorizations\n", "\n", "In the `climate_categories` package, the categorizations are available\n", "directly at the top-level namespace, and as a dictionary in `.cats`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import climate_categories\n", "\n", "climate_categories.cats" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.cats[\"IPCC2006\"]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Metadata for each categorization are accessible as properties:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "print(climate_categories.IPCC2006.name)\n", "print(climate_categories.IPCC2006.title)\n", "print(climate_categories.IPCC2006.comment)\n", "print(climate_categories.IPCC2006.references)\n", "print(climate_categories.IPCC2006.institution)\n", "print(climate_categories.IPCC2006.last_update)\n", "print(climate_categories.IPCC2006.version)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The categorization can be used as a dictionary mapping category codes to categories:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006[\"1.A\"]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "You can also query using alternative spellings of the code:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006[\"1A\"]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "For the categories, metadata is also available: a `title`, maybe a\n", "`comment`, all of its `codes` and possibly additional non-standard information\n", "in the `info` dictionary:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "one_a = climate_categories.IPCC2006[\"1.A\"]\n", "print(one_a.title)\n", "print(one_a.comment)\n", "print(one_a.codes)\n", "print(one_a.info)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "For hierarchical categorizations, you can also query for parent and child\n", "categories.\n", "Note that a list of sets of children is returned in case a category\n", "can be composed differently:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006[\"1.A\"].children" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006[\"1.A\"].parents" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Finally, you can check if a categorization is hierarchical, and\n", "for hierarchical categorizations you can check if the sum of all\n", "child categories should be equal to the sum of parent categories:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "print(f\"Hierachical: {climate_categories.IPCC2006.hierarchical}\")\n", "print(f\"Total sum: {climate_categories.IPCC2006.total_sum}\")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Visualization\n", "\n", "The relationships between categories in a hierarchical categorization can be visualized\n", "in a tree-like fashion:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# Limit the maximum depth shown using `maxdepth`\n", "print(climate_categories.IPCC2006.show_as_tree(maxdepth=2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# Print only a part of tree using `root`\n", "print(climate_categories.IPCC2006.show_as_tree(root=\"1A1\"))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Child sets in hierarchical categorizations\n", "\n", "For hierarchical categorizations, it is possible that a category can be composed of multiple child sets.\n", "As an example, in emissions reporting it is possible to report industrial emissions either by industry sectors, or by fuel.\n", "In this case, the parent `industry` category has two sets of children: either all the industry sectors, or all of the fuels.\n", "\n", "You can see this with two toy example categorizations included in `climate_categories`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import climate_categories.tests.examples\n", "\n", "HierEx = climate_categories.tests.examples.HierEx()\n", "print(\"Hierarchical categorization with only one way to subdivide the top category:\")\n", "print(HierEx.show_as_tree())\n", "\n", "HierAltEx = climate_categories.tests.examples.HierAltEx()\n", "print(\"\\nHierarchical categorization with two ways to subdivide the top category:\")\n", "print(HierAltEx.show_as_tree())" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "As you can see, alternative ways to subdivide a category are indicated with double lines in `show_as_tree`.\n", "Programmatically, the difference is clear because `children` contains a list of possible child sets:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "HierEx[\"0\"].children" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "HierAltEx[\"0\"].children" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "### Finding leaf descendants\n", "\n", "For purposes like re-calculating top-level categories from simple leaf categories, it\n", "is useful to find the descendants of a category which have no children. Use the\n", "`leaf_children` property to do so." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "HierEx[\"0\"].leaf_children" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Extending categorizations\n", "\n", "Often, you want to use a common categorization, but for one reason or\n", "another, you have to add a couple of categories. This is possible:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "IPCC2006_lulucf_extra = climate_categories.IPCC2006.extend(\n", " name=\"IPCC2006_lulucf_extra\",\n", " categories={\n", " \"M0.EL\": {\n", " \"title\": \"Total excluding lulucf\",\n", " \"comment\": \"All emissions and removals except emissions from land use, land use change, and forestry\",\n", " }\n", " },\n", " children=[(\"M0.EL\", (\"1\", \"2\", \"4\", \"5\"))],\n", ")\n", "\n", "print(IPCC2006_lulucf_extra.name)\n", "print(IPCC2006_lulucf_extra.title)\n", "print(IPCC2006_lulucf_extra.comment)\n", "print(IPCC2006_lulucf_extra.show_as_tree(maxdepth=2))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "If the canonical top level category of hierarchical\n", "categorizations is defined, you can also calculate the level of a category in the\n", "hierarchy:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "print(climate_categories.IPCC2006[\"0\"].level)\n", "print(climate_categories.IPCC2006[\"1.A\"].level)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Pandas integration\n", "\n", "For each categorization, the categories are also available as a pandas\n", "DataFrame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.IPCC2006.df" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Finding unknown codes\n", "\n", "Searching for a code in all included categorizations is possible\n", "using the `find_code` function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "climate_categories.find_code(\"1A\")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "## Conversions\n", "\n", "### Included conversions\n", "\n", "You can get the rules for conversion between categories using\n", "the conversion objects:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "conv = climate_categories.IPCC1996.conversion_to(\"IPCC2006\")\n", "conv" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "The conversion object can be queried for metadata:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "print(conv.categorization_a)\n", "print(conv.categorization_b)\n", "print(conv.auxiliary_categorizations)\n", "print(conv.comment)\n", "print(conv.institution)\n", "print(conv.references)\n", "print(conv.version)\n", "print(conv.last_update)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "More importantly, the conversion object also holds the actual conversion rules:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "len(conv.rules)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "conv.rules[0]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "For the rules, the most important parts are the categories and factors for each side and the specification for which auxiliary categories the rule is valid:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "rule = conv.rules[0]\n", "print(rule.factors_categories_a)\n", "print(rule.factors_categories_b)\n", "print(rule.auxiliary_categories)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "In this example, the category 1 of the IPCC1996 categorization equals exactly the category 1 of the IPCC2006 categorization (i.e. for both, the factor is unity) and the rule is valid for all gases (the set of gases is empty, meaning an unrestricted rule)." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Often, you might be interested in rules concerning a specific set of categories, which you can also fetch:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "one_or_two_rules = conv.relevant_rules(\n", " {climate_categories.IPCC1996[\"1\"], climate_categories.IPCC1996[\"2\"]}\n", ")\n", "\n", "for rule in one_or_two_rules:\n", " print(\"###\")\n", " print(rule.format_human_readable())" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here, we used the `format_human_readable()` function to get a nicely formatted output.\n", "You can see that for this conversion, the category 1 maps cleanly between the two categorizations, but for category 2, you need to combine categories 2 and 3 from IPCC1996 to get category 2 from IPCC2006." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Finding inconsistencies\n", "\n", "Dealing with large conversion rule sets can be daunting, so there are a few functions to help when checking conversions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# describe all of the conversion rules\n", "# We only print the start of the description because it is looong\n", "print(conv.describe_detailed()[:200] + \"…\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# Find potential over counting problems\n", "suspected_problems = conv.find_over_counting_problems()\n", "for p in suspected_problems:\n", " print(p)\n", " print()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "Note that `find_over_counting_problems` at the moment can't reliably detect all over counting problems and also some suspected problems might be fine under closer examination.\n", "Use this function only to generate hints for possible problems." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# Find unmapped categories\n", "missing_1996, missing_2006 = conv.find_unmapped_categories()\n", "# returns sets of missing categories\n", "# unfortunately, this conversion is not very complete:\n", "print(len(missing_1996))\n", "print(len(missing_2006))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# the sets just contain regular categories which were forgotten\n", "next(iter(missing_1996))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Viewing the full conversion\n", "\n", "To view the full conversion, also consider directly loading the source CSVs from the `climate_categories/data/` folder in a spreadsheet program." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }