Master Project List

The master project list serves as the single source of truth, providing a comprehensive overview of all projects undertaken by UNDP.

An accurate and up-to-date master project list is crucial for effective performance monitoring and compliance. It enables stakeholders to track project progress, allocate resources effectively, and ensure transparency and accountability.

For instance, this enables the Performance App to show the list of projects that are linked to each CPD outcome for each Country Office:

Available Data

To create a reliable master project list, UNDP leverages multiple datasets that contain project-related information:

  • IATI_FINANCIALS data: This is the IATI Project List, a list of projects used for external reporting on the Transparency Portal.

  • UNDP_CPD_SP data: The CPD (Country Programming Document) Project lists the projects for which the COs have matched them to the CPD outcomes.

  • UNDP_PROJECTS data: This is the master dataset of all UNDP projects and includes budget data for each project, broken down by CPD outcome.

  • QA & SESP data: A list of projects that require QA (Quality Assurance) or SESP (Social and Environmental Screening Procedure).

Note that UNDP_CPD_SP and IATI_FINANCIALS do not list all the projects but are used to enrich the data within UNDP_PROJECTS.

The IATI data can be taken from the UNDP Data Warehouse with the following query:

select
    a.hq_co,
    a.bureau,
    -- a.rollup_ou,
    rollup_ou = case when a.BUSINESS_UNIT <> 'HQ' then a.BUSINESS_UNIT else a.rollup_ou end,
    -- a.PROJECT_ID,
    a.PROJECT_NUMBER,
    a.PROJECT_NAME,
    iati_compatibility_project = case
        -- converted projects include double zeroes upfront
        when a.PROJECT_NUMBER like '00%' then isNull(a.ATLAS_AWARD_NUMBER, '' /* a.PROJECT_NUMBER */)
        else a.PROJECT_NUMBER
    end
from UNDP_IATI.UNDP_PROJECTS a
join UNDP_IATI.IATI_FINANCIALS c on a.PROJECT_ID = c.PROJECT_ID
    and c.FUND_CATEGORY = 'PROGRAMME'
group by a.hq_co,
    a.bureau,
    -- a.rollup_ou,
    case when a.BUSINESS_UNIT <> 'HQ' then a.BUSINESS_UNIT else a.rollup_ou end,
    -- a.PROJECT_ID,
    a.PROJECT_NUMBER,
    a.PROJECT_NAME,
    case
        -- converted projects include double zeroes upfront
        when a.PROJECT_NUMBER like '00%' then isNull(a.ATLAS_AWARD_NUMBER, '' /* a.PROJECT_NUMBER */)
        else a.PROJECT_NUMBER
    end

Columns in IATI_FINANCIALS data
  • hq_co: Location type (Headquarters (HQ), Country Office (CO), or Regional Center (RC)).

  • bureau: Name of the specialized unit or division responsible.

  • rollup_ou: Code for the organizational unit.

  • rollup_ou_description: Name of the organizational unit.

  • PROJECT_ID: Identifier for the project.

  • PROJECT_NUMBER: Number assigned to the project.

  • TASK_ID: Identifier for the task.

  • TASK_NUMBER: Number assigned to the task.

  • FUND_CODE: Code representing the funding source.

  • FUND_CATEGORY: Category that the funding falls into.

  • DONOR: Code representing the donor.

  • DONOR_DESCR: Full name of the donor.

  • DONOR_DESCSHORT: Abbreviated description of the donor.

  • donor_type_lvl1: Classification of donor (Non-Government, Program city, Non-program city, or Other).

  • donor_type_lvl1_descr: Detailed description of the donor classification.

  • donor_type_lvl2: Secondary level name of donor.

  • donor_type_lvl2_descr: Description for the second level of donor classification.

  • donor_type_lvl3: Tertiary level name of donor.

  • donor_type_lvl3_descr: Description for the third level of donor classification.

  • fiscal_year: Fiscal year range (2023 to 2031).

  • Budget: Project budget amount in USD.

  • Expenditure: Total amount of money expended (includes negative numbers).

  • Load_Timestamp: Date and time of data entry.

Columns in UNDP_CPD_SP data
  • hq_co: Type of location (Headquarters (HQ), Country Office (CO), or Regional Center (RC)).

  • bureau: Name of the specialized unit or division responsible.

  • rollup_ou: Code of the organizational unit.

  • rollup_ou_description: Name of the organizational unit.

  • BUSINESS_UNIT: Identifier for the business unit.

  • PROJECT_ID: Identifier for the project.

  • PROJECT_NUMBER: Number assigned to the project.

  • TASK_ID: Identifier for the task.

  • TASK_NUMBER: Number assigned to the task.

  • TASK_NAME: Name of the task.

  • CPD_OUTPUT: Code for CPD output.

  • CPD_OUTPUT_DESCRIPTION: Description of CPD output.

  • CPD_OUTCOME: Code for CPD outcome.

  • CPD_OUTCOME_DESCRIPTION: Description of CPD outcome.

  • sp_outcome: SP outcome code.

  • sp_outcome_id: Identifier for SP outcome.

  • sp_outcome_description: Description of SP outcome.

  • sp_output: SP output code.

  • sp_output_description: Description of SP output.

  • SP_PRIMARY_RESULT_LINKAGE: Linkage to the primary result for SP projects.

  • SIGNATURE_SOLUTION: Code for signature solution.

  • SIGNATURE_SOLUTION_DESCRIPTION: Description of signature solution.

  • BUDGET_IDENTIFIER: Code for budget identifier.

  • BUDGET_IDENTIFIER_DESCRIPTION: Description of budget identifier.

  • CPD_CYCLE_NAME: Name of the CPD cycle.

  • CPD_CYCLE: Code for CPD cycle.

  • BEGIN_YEAR: Start year of the cycle.

  • END_YEAR: End year of the cycle.

Columns in UNDP_PROJECTS data

  • hq_co: Type of location (Headquarters (HQ), Country Office (CO), or Regional Center (RC)).

  • bureau: Name of the specialized unit or division responsible.

  • rollup_ou: Code of the organizational unit.

  • rollup_ou_description: Name of the organizational unit.

  • PROJECT_ID: Identifier for the project.

  • PROJECT_NUMBER: Number assigned to the project.

  • PROJECT_NAME: Name of the project.

  • PROJECT_DESCRIPTION: Description of the project.

  • ATLAS_AWARD_NUMBER: Atlas award number.

  • ATLAS_AWARD_DESCIPTION: Description of the Atlas award.

  • BUSINESS_UNIT: Identifier for the business unit.

  • ORGANIZATION: Organization involved in the project.

  • DEPARTMENT: Department associated with the project.

  • START_DATE: Start date of the project.

  • COMPLETION_DATE: Completion date of the project.

  • CLOSED_DATE: Date when the project was closed.

  • PROJECT_TYPE: Type of project.

  • PROJECT_TYPE_DESCRIPTION: Description of the project type.

  • PROJECT_STATUS: Status of the project.

  • PROJECT_MANAGER: Project manager's name.

  • PROJECT_MANAGER_EMAIL: Email address of the project manager.

  • IMPLEMENTING_PARTNER: Implementing partner code.

  • IMPLEMENTING_PARTNER_DESCRIPTION: Description of the implementing partner.

  • IMPLEMENTATION_MODALITY: Implementation modality code.

  • IMPLEMENTATION_MODALITY_DESCRIPTION: Description of the implementation modality.

  • PROJECT_ORIG_TEMPLATE: Original template of the project.

  • PROGRAMME_FUNDING_FLAG: Flag indicating program funding.

  • GEF_GCF_PROJECT_FLAG: Flag indicating GEF/GCF project.

  • Other_References_Value: Other references value.

Calculation

To get a list of projects with budgets for each project and by CPD outcome.

  1. Comparison of Project IDs in 3 datasets: Extract unique project IDs in 3 datasets. The aim is to find the total number of unique projects in each dataset and ensure that all projects within UNDP_CPD_SP and IATI_FINANCIALS can be found within UNDP_PROJECTS

  2. Checking UNDP_CPD_SP Dataset: Retrieve unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID': In UNDP_CPD_SP, select the columns 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' for analysis. Then, drop duplicate rows based on unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' and retain the first occurrence of each unique combination. Analyze the count of unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' . The aim is to verify the dataset UNDP_CPD_SP contains only unique combinations of PROJECT_ID and CPD_OUTCOME and TASK_ID. Also, to verify the count of such unique combinations equal to number of rows of UNDP_CPD_SP dataset.

  3. Check Missing Values: Check in all 3 datasets whether there is any row with missing values for the columns 'PROJECT_ID', 'PROJECT_NUMBER', 'hq_co', 'bureau', 'rollup_ou', 'rollup_ou_description'.

  4. Display number of task ids with a budget value in IATI_FINANCIALS data.

  5. Merge Datasets: Utilize the common columns ['hq_co', 'bureau', 'rollup_ou', 'rollup_ou_description', 'PROJECT_ID', 'PROJECT_NUMBER'] to merge UNDP_PROJECTS with UNDP_CPD_SP. Perform a left join to retain all records from UNDP_PROJECTS while incorporating matching records from UNDP_CPD_SP. Later, utilize the same common columns to merge the previously merged dataframe with IATI_FINANCIALS. Again, perform a left join to retain all records from the previously merged dataframe while incorporating matching records from IATI_FINANCIALS.

  6. Specify columns to retain only the desired columns: Define a list of columns to be retained in the final dataframe based on project-related information and identifiers. After this, extract the selected columns from the merged dataframe to create the final Dataframe. Use the selected column list to filter the merged dataframe and retain only the desired columns.

  7. Checking all the rows with budget captured in merged data: Count the number of unique 'TASK_ID' values corresponding to rows with budget information. This is performed to provide insight into the completeness of budget data captured within the merged Dataframe. Also, we can verify previously displayed value regarding the number of task ids with a budget value in IATI_FINANCIALS data is same.

  8. Count unique projects: Count and display the number of unique combinations of 'PROJECT_ID' and 'CPD_OUTCOME' in the merged dataframe.

  9. Aggregation of Budget Data: A single project ID can have multiple CPD outcomes. Initially, the code defines an aggregation function to calculate the sum of the 'budget' column for each group defined by 'PROJECT_ID' and 'CPD_OUTCOME'. This step aggregates budget data across different outcomes for each project. To ensure that all unique 'PROJECT_ID' values are retained, a dataframe is created containing all unique 'PROJECT_ID's from the original dataset. The aggregated budget data is merged with the dataframe containing all unique 'PROJECT_ID' values. This step is important to ensure that even projects with no budgetary allocations or outcomes are included in the final analysis.

Project status

Print the counts of unique project IDs for each project status to show the distribution of projects across different statuses.

The statuses are:

  • Financially Closed

  • On Going

  • Operationally Closed

  • Submit for Operational Close

  • Submitted for Financial close

10. Filter data by specified project statuses:

Select rows from the dataframe where the 'PROJECT_STATUS' column matches the specified status values: 'On Going', 'Operationally Closed', and 'Submit for Operational Close'.

The items below were taken from previous SESP Methodology and must be included here

Filtering out Outputs: The initial step involves removing duplicate project records to adhere to the "Quantum" data management approach, transitioning from the "Atlas" method's one-row-per-project standard. This deduplication ensures each project is uniquely represented by eliminating redundancies based on specific attributes. These attributes include the project's operating unit, overseeing bureau, activity status, year of data entry, quality assurance eligibility and requirements, QA status, SESP requirements, SESP status, project identification number, and approval date. The aim is to maintain a dataset where each row uniquely represents a distinct project, setting the stage for precise analysis.

Identifying Unique Projects: After applying these filters, we count the number of unique projects by their ProjectNum_Unified identifier. This final step provides the total count of distinct projects meeting all the specified criteria.

Last updated