Skip to content

Objectives

BigQuery Manager has been developed with three core guiding principles: Visibility, Simplicity and Efficiency. These can also be considered as three stages of implementation and therefore corresponding levels of maturity.

Visibility

The first step is to map out the resources (tables, views, routines), activity and costs relating to each resource or set/type of resources. This is the foundational knowledge upon which subsequent activities are based. This resource metadata can then be augmented with real-time usage data to give a clear view of all resources and usage across multiple projects and organisations. This stage powers data capture for granular historical reporting and modelling, as well as the future monitoring strategy.

A more advanced analysis can also include connections between the resources and granular permissions to ensure all data is secured appropriately.

Simplicity

Any Data Warehouse tends towards chaos without conscious effort to the contrary. The first step in bringing order to the warehouse is simplification: identifying defunct resources, backing them up and deleting them is a good starting point. Standardisation of naming conventions and simplification of logical relationships and permissions is a subsequent, more complex stage to improve the structure of the data warehouse and the logic flowing through it.

Effiency

Date Warehouse performance and efficiency improvements result in faster speed and reduced cost, which is desirable in all situations. Improvements in visibility and simplicity will support these lower costs and increased performance, supporting targeted action as well as cost optimisation by modelling the impact of different pricing models on actual usage data.

Capabilities

These objectives are achieved via the following capabilities, enabled by the architecture outlined in the BigQuery Manager Architecture section.

flowchart LR

subgraph BQMANAGER CAPABILITIES
   subgraph ANALYSIS
        RESOURCE_MAPPING(["Resource Mapping"])
        DEPENDENCY_MAPPING(["Dependency Mapping"])
    end

    subgraph MONITORING
        ACCESS_MONITORING(["Access Monitoring"])
        COST_MONITORING(["Cost Monitoring"])
        COST_MODELLING(["Cost Modelling"])
        EXCEPTION_MONITORING(["Exception Monitoring"])
        INBOUND_DATA_MONITORING(["Inbound Data Monitoring"])
        STORAGE_MONITORING(["Storage Monitoring"])
        SCHEDULED_QUERY_MONITORING(["Scheduled Query Monitoring"])
        USAGE_MONITORING(["Usage Monitoring"])
    end

    subgraph IMPROVEMENT
        BACKUP_MANAGEMENT(["Backup Management"])
        CLEANUP_AUTOMATION(["Cleanup Automation"])
        COST_OPTIMISATION(["Cost Optimisation"])
        LOG_STORAGE(["Log Storage"])
    end  

end

Note that Dependency Mapping and notifications attached to all Monitoring capabilities are advanced features and require access to external APIs (SQLGlot and Slack respectively, deployed as BigQuery Remote Functions), in addition to usage of complex recursive queries to traverse the derived dependency tree.

Deployment

We can deploy the standard BQMANAGER capabilities to any us or eu located dataset, which can then be used on any number of projects in that region. We have a standard consulting enagement through which we deploy all of the resources, and then an ongoing licence to enable refresh of the BQMANAGER foundation tables. Note that there are often complex permissions challenges to navigate and resolve as part of this process. We can also deploy BQMANAGER to different geographic locations upon request.

Please contact bqmanager@transformationflow.io for further information. We also offer bespoke consulting services based on this library and specific desired analytical outcomes.