Master Project List
The master project list serves as the single source of truth, providing a comprehensive overview of all projects undertaken by UNDP.
Last updated
The master project list serves as the single source of truth, providing a comprehensive overview of all projects undertaken by UNDP.
Last updated
An accurate and up-to-date master project list is crucial for effective performance monitoring and compliance. It enables stakeholders to track project progress, allocate resources effectively, and ensure transparency and accountability.
For instance, this enables the Performance App to show the list of projects that are linked to each CPD outcome for each Country Office:
To create a reliable master project list, UNDP leverages multiple datasets that contain project-related information:
IATI_FINANCIALS data: This is the IATI Project List, a list of projects used for external reporting on the Transparency Portal.
UNDP_CPD_SP data: The CPD (Country Programming Document) Project lists the projects for which the COs have matched them to the CPD outcomes.
UNDP_PROJECTS data: This is the master dataset of all UNDP projects and includes budget data for each project, broken down by CPD outcome.
QA & SESP data: A list of projects that require QA (Quality Assurance) or SESP (Social and Environmental Screening Procedure).
Note that UNDP_CPD_SP and IATI_FINANCIALS do not list all the projects but are used to enrich the data within UNDP_PROJECTS.
The IATI data can be taken from the UNDP Data Warehouse with the following query:
To get a list of projects with budgets for each project and by CPD outcome.
Comparison of Project IDs in 3 datasets: Extract unique project IDs in 3 datasets. The aim is to find the total number of unique projects in each dataset and ensure that all projects within UNDP_CPD_SP and IATI_FINANCIALS can be found within UNDP_PROJECTS
Checking UNDP_CPD_SP Dataset: Retrieve unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID': In UNDP_CPD_SP, select the columns 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' for analysis. Then, drop duplicate rows based on unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' and retain the first occurrence of each unique combination. Analyze the count of unique combinations of 'PROJECT_ID', 'CPD_OUTCOME', and 'TASK_ID' . The aim is to verify the dataset UNDP_CPD_SP contains only unique combinations of PROJECT_ID and CPD_OUTCOME and TASK_ID. Also, to verify the count of such unique combinations equal to number of rows of UNDP_CPD_SP dataset.
Check Missing Values: Check in all 3 datasets whether there is any row with missing values for the columns 'PROJECT_ID', 'PROJECT_NUMBER', 'hq_co', 'bureau', 'rollup_ou', 'rollup_ou_description'.
Display number of task ids with a budget value in IATI_FINANCIALS data.
Merge Datasets: Utilize the common columns ['hq_co', 'bureau', 'rollup_ou', 'rollup_ou_description', 'PROJECT_ID', 'PROJECT_NUMBER'] to merge UNDP_PROJECTS with UNDP_CPD_SP. Perform a left join to retain all records from UNDP_PROJECTS while incorporating matching records from UNDP_CPD_SP. Later, utilize the same common columns to merge the previously merged dataframe with IATI_FINANCIALS. Again, perform a left join to retain all records from the previously merged dataframe while incorporating matching records from IATI_FINANCIALS.
Specify columns to retain only the desired columns: Define a list of columns to be retained in the final dataframe based on project-related information and identifiers. After this, extract the selected columns from the merged dataframe to create the final Dataframe. Use the selected column list to filter the merged dataframe and retain only the desired columns.
Checking all the rows with budget captured in merged data: Count the number of unique 'TASK_ID' values corresponding to rows with budget information. This is performed to provide insight into the completeness of budget data captured within the merged Dataframe. Also, we can verify previously displayed value regarding the number of task ids with a budget value in IATI_FINANCIALS data is same.
Count unique projects: Count and display the number of unique combinations of 'PROJECT_ID' and 'CPD_OUTCOME' in the merged dataframe.
Aggregation of Budget Data: A single project ID can have multiple CPD outcomes. Initially, the code defines an aggregation function to calculate the sum of the 'budget' column for each group defined by 'PROJECT_ID' and 'CPD_OUTCOME'. This step aggregates budget data across different outcomes for each project. To ensure that all unique 'PROJECT_ID' values are retained, a dataframe is created containing all unique 'PROJECT_ID's from the original dataset. The aggregated budget data is merged with the dataframe containing all unique 'PROJECT_ID' values. This step is important to ensure that even projects with no budgetary allocations or outcomes are included in the final analysis.
Print the counts of unique project IDs for each project status to show the distribution of projects across different statuses.
The statuses are:
Financially Closed
On Going
Operationally Closed
Submit for Operational Close
Submitted for Financial close
Select rows from the dataframe where the 'PROJECT_STATUS' column matches the specified status values: 'On Going', 'Operationally Closed', and 'Submit for Operational Close'.
The items below were taken from previous SESP Methodology and must be included here
Filtering out Outputs: The initial step involves removing duplicate project records to adhere to the "Quantum" data management approach, transitioning from the "Atlas" method's one-row-per-project standard. This deduplication ensures each project is uniquely represented by eliminating redundancies based on specific attributes. These attributes include the project's operating unit, overseeing bureau, activity status, year of data entry, quality assurance eligibility and requirements, QA status, SESP requirements, SESP status, project identification number, and approval date. The aim is to maintain a dataset where each row uniquely represents a distinct project, setting the stage for precise analysis.
Identifying Unique Projects: After applying these filters, we count the number of unique projects by their ProjectNum_Unified
identifier. This final step provides the total count of distinct projects meeting all the specified criteria.