How-To Tutorials

article-image-optimizing-graphics-pipelines-with-meshlets-a-guide-to-efficient-geometry-processing

09 Dec 2024

15 min read

Optimizing Graphics Pipelines with Meshlets: A Guide to Efficient Geometry Processing

09 Dec 2024

This article is an excerpt from the book, "Mastering Graphics Programming with Vulkan", by Marco Castorina, Gabriel Sassone. Mastering Graphics Programming with Vulkan starts by familiarizing you with the foundations of a modern rendering engine. This book will guide you through GPU-driven rendering and show you how to drive culling and rendering from the GPU to minimize CPU overhead. Finally, you’ll explore advanced rendering techniques like temporal anti-aliasing and ray tracing.IntroductionIn modern graphics pipelines, optimizing the geometry stage can have a significant impact on overall rendering performance. This article delves into the concept of meshlets—an approach to breaking down large meshes into smaller, more manageable chunks for efficient GPU processing. By subdividing meshes into meshlets, we can enhance culling techniques, reduce unnecessary shading, and better handle complex geometry. Join us as we explore how meshlets work, their benefits, and practical steps to implement them.Breaking down large meshes into meshletsIn this article, we are going to focus primarily on the geometry stage of the pipeline, the one before the shading stage. Adding some complexity to the geometry stage of the pipeline will pay dividends in later stages as we’ll reduce the number of pixels that need to be shaded.NoteWhen we refer to the geometry stage of the graphics pipeline, we don’t mean geometry shaders. Th e geometry stage of the pipeline refers to input assembly (IA), vertex processing, and primitive assembly (PA). Vertex processing can, in turn, run one or more of the following shaders: vertex, geometry, tessellation, task, and mesh shaders.Content geometry comes in many shapes, sizes, and complexity. A rendering engine must be able to deal with meshes from small, detailed objects to large terrains. Large meshes (think terrain or buildings) are usually broken down by artists so that the rendering engine can pick out the diff erent levels of details based on the distance from the camera of these objects.Breaking down meshes into smaller chunks can help cull geometry that is not visible, but some of these meshes are still large enough that we need to process them in full, even if only a small portion is visible.Meshlets have been developed to address these problems. Each mesh is subdivided into groups of vertices (usually 64) that can be more easily processed on the GPU.The following image illustrates how meshes can be broken down into meshlets:Figure 6.1 – A meshlet subdivision exampleThese vertices can make up an arbitrary number of triangles, but we usually tune this value according to the hardware we are running on. In Vulkan, the recommended value is 126 (as written in https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/, the number is needed to reserve some memory for writing the primitive count with each meshlet).NoteAt the time of writing, mesh and task shaders are only available on Nvidia hardware through its extension. While some of the APIs described in this chapter are specifi c to this extension, the concepts can be generally applied and implemented using generic compute shaders. A more generic version of this extension is currently being worked on by the Khronos committee so that mesh and task shaders should soon be available from other vendors!Now that we have a much smaller number of triangles, we can use them to have much finer-grained control by culling meshlets that are not visible or are being occluded by other objects.Together with the list of vertices and triangles, we also generate some additional data for each meshlet that will be very useful later on to perform back-face, frustum, and occlusion culling.One additional possibility (that will be added in the future) is to choose the level of detail (LOD) of a mesh and, thus, a different subset of meshlets based on any wanted heuristic.The first of this additional data represents the bounding sphere of a meshlet, as shown in the following screenshot:Figure 6.2 – A meshlet bounding spheres example; some of the larger spheres have been hidden for claritySome of you might ask: why not AABBs? AABBs require at least two vec3 of data: one for the center and one for the half-size vector. Another encoding could be to store the minimum and maximum corners. Instead, spheres can be encoded with a single vec4: a vec3 for the center plus the radius.Given that we might need to process millions of meshlets, each saved byte counts! Spheres can also be more easily tested for frustum and occlusion culling, as we will describe later in the chapter.The next additional piece of data that we’re going to use is the meshlet cone, as shown in the following screenshot:Figure 6.3 – A meshlet cone example; not all cones are displayed for clarityThe cone indicates the direction a meshlet is facing and will be used for back-face culling.Now we have a better understanding of why meshlets are useful and how we can use them to improve the culling of larger meshes, let’s see how we generate them in code!Generating meshletsWe are using an open source library, called MeshOptimizer (https://github.com/zeux/meshoptimizer) to generate the meshlets. An alternative library is meshlete (https:// github.com/JarkkoPFC/meshlete) and we encourage you to try both to find the one that best suits your needs.After we have loaded the data (vertices and indices) for a given mesh, we are going to generate the list of meshlets. First, we determine the maximum number of meshlets that could be generated for our mesh and allocate memory for the vertices and indices arrays that will describe the meshlets:const sizet max_meshlets = meshopt_buildMeshletsBound( indices_accessor.count, max_vertices, max_triangles ); Array<meshopt_Meshlet> local_meshlets; local_meshlets.init( temp_allocator, max_meshlets, max_meshlets ); Array<u32> meshlet_vertex_indices; meshlet_vertex_indices.init( temp_allocator, max_meshlets * max_vertices, max_meshlets* max_vertices ); Array<u8> meshlet_triangles; meshlet_triangles.init( temp_allocator, max_meshlets * max_triangles * 3, max_meshlets* max_triangles * 3 );Notice the types for the indices and triangle arrays. We are not modifying the original vertex or index buffer, but only generating a list of indices in the original buffers. Another interesting aspect is that we only need 1 byte to store the triangle indices. Again, saving memory is very important to keep meshlet processing efficient!The next step is to generate our meshlets:const sizet max_vertices = 64; const sizet max_triangles = 124; const f32 cone_weight = 0.0f; sizet meshlet_count = meshopt_buildMeshlets( local_meshlets.data, meshlet_vertex_indices.data, meshlet_triangles.data, indices, indices_accessor.count, vertices, position_buffer_accessor.count, sizeof( vec3s ), max_vertices, max_triangles, cone_weight );As mentioned in the preceding step, we need to tell the library the maximum number of vertices and triangles that a meshlet can contain. In our case, we are using the recommended values for the Vulkan API. The other parameters include the original vertex and index buffer, and the arrays we have just created that will contain the data for the meshlets.Let’s have a better look at the data structure of each meshlet:struct meshopt_Meshlet { unsigned int vertex_offset; unsigned int triangle_offset; unsigned int vertex_count; unsigned int triangle_count; };Each meshlet is described by two offsets and two counts, one for the vertex indices and one for the indices of the triangles. Note that these off sets refer to meshlet_vertex_indices and meshlet_ triangles that are populated by the library, not the original vertex and index buff ers of the mesh.Now that we have the meshlet data, we need to upload it to the GPU. To keep the data size to a minimum, we store the positions at full resolution while we compress the normals to 1 byte for each dimension and UV coordinates to half-float for each dimension. In pseudocode, this is as follows:meshlet_vertex_data.normal = ( normal + 1.0 ) * 127.0; meshlet_vertex_data.uv_coords = quantize_half( uv_coords );The next step is to extract the additional data (bounding sphere and cone) for each meshlet:for ( u32 m = 0; m < meshlet_count; ++m ) { meshopt_Meshlet& local_meshlet = local_meshlets[ m ]; meshopt_Bounds meshlet_bounds = meshopt_computeMeshletBounds( meshlet_vertex_indices.data + local_meshlet.vertex_offset, meshlet_triangles.data + local_meshlet.triangle_offset, local_meshlet.triangle_count, vertices, position_buffer_accessor .count, sizeof( vec3s ) ); ... }We loop over all the meshlets and we call the MeshOptimizer API that computes the bounds for each meshlet. Let’s see in more detail the structure of the data that is returned:struct meshopt_Bounds { float center[3]; float radius; float cone_apex[3]; float cone_axis[3]; float cone_cutoff; signed char cone_axis_s8[3]; signed char cone_cutoff_s8; };The first four floats represent the bounding sphere. Next, we have the cone definition, which is comprised of the cone direction (cone_axis) and the cone angle (cone_cutoff). We are not using the cone_apex value as it makes the back-face culling computation more expensive. However, it can lead to better results.Once again, notice that quantized values (cone_axis_s8 and cone_cutoff_s8) help us reduce the size of the data required for each meshlet.Finally, meshlet data is copied into GPU buff ers and it will be used during the execution of task and mesh shaders.For each processed mesh, we will also save an offset and count of meshlets to add a coarse culling based on the parent mesh: if the mesh is visible, then its meshlets will be added.In this article, we have described what meshlets are and why they are useful to improve the culling of geometry on the GPU.ConclusionMeshlets represent a powerful tool for optimizing the rendering of complex geometries. By subdividing meshes into small, efficient chunks and incorporating additional data like bounding spheres and cones, we can achieve finer-grained control over visibility and culling processes. Whether you're leveraging advanced shader technologies or applying these concepts with compute shaders, adopting meshlets can lead to significant performance improvements in your graphics pipeline. With libraries like MeshOptimizer at your disposal, implementing this technique has never been more accessible.Author BioMarco Castorina first became familiar with Vulkan while working as a driver developer at Samsung. Later, he developed a 2D and 3D renderer in Vulkan from scratch for a leading media server company. He recently joined the games graphics performance team at AMD. In his spare time, he keeps up to date with the latest techniques in real-time graphics. He also likes cooking and playing guitar.Gabriel Sassone is a rendering enthusiast currently working as a principal rendering engineer at The Multiplayer Group. Previously working for Avalanche Studios, where he first encountered Vulkan, they developed the Vulkan layer for the proprietary Apex Engine and its Google Stadia port. He previously worked at ReadyAtDawn, Codemasters, FrameStudios, and some other non-gaming tech companies. His spare time is filled with music and rendering, gaming, and outdoor activities.

0
0
70757

article-image-mastering-performance-tuning-with-dax-studio-and-vertipaq-analyzer

Thomas LeBlanc, Bhavik Merchant

03 Dec 2024

15 min read

Mastering Performance Tuning with DAX Studio and VertiPaq Analyzer

Thomas LeBlanc, Bhavik Merchant

03 Dec 2024

15 min read

This article is an excerpt from the book, "Microsoft Power BI Performance Best Practices - Second Edition", by Thomas LeBlanc, Bhavik Merchant. Overcome common challenges in data management, visualization, and security with this updated edition of Microsoft Power BI Performance Best Practices, and ramp-up your Power BI solutions, achieve faster insights, and drive better business outcomes.IntroductionOptimizing performance and storage in Power BI and Analysis Services can be a complex task. However, tools like DAX Studio and VertiPaq Analyzer simplify this process by providing insightful metrics and performance-tuning capabilities. This article explores how to leverage these tools to analyze semantic models, identify performance bottlenecks, and optimize DAX queries. We'll discuss key features such as viewing model metrics, capturing and analyzing query traces, and testing optimizations using DAX Studio's query editor.Tuning with DAX Studio and VertiPaq AnalyserDAX Studio, as the name implies, is a tool centered on DAX queries. It provides a simple yet intuitive interface with powerful features to browse and query Analysis Services semantic models. We will cover querying later in this section. For now, let’s look deeper into semantic models.The Analysis Services engine has supported dynamic management views (DMVs) for over a decade. These views refer to SQL-like queries that can be executed on Analysis Services to return information about semantic model objects and operations.VertiPaq Analyzer is a utility that uses publicly documented DMVs to display essential information about which structures exist inside the semantic model and how much space they occupy. It started life as a standalone utility, published as a Power Pivot for an Excel workbook, and still exists in that form today. In this chapter, we will refer to its more recent incarnation as a built-in feature of DAX Studio 3.0.11.It is interesting to note that VertiPaq is the original name given to the compressed column store engine within Analysis Services (Verti referring to columns and Paq referring to compression).Analyzing model size with VertiPaq AnalyzerVertiPaq Analyzer is built into DAX Studio as the View Metrics features, found in the Advanced tab of the toolbar. You simply click the icon to have DAX Studio run the DMVs for you and display statistics in a tabular form. This is shown in the following figure:Figure 6.8 – Using View Metrics to generate VertiPaq Analyzer statsYou can switch to the Summary tab of the VertiPaq Analyzer pane to get an idea of the overall total size of the model along with other summary statistics, as shown in the following figure:Figure 6.9 – Summary tab of VertiPaq AnalyzerThe Total Size metric provided in the previous figure will often be larger than the size of the semantic model on disk (as a .pbix file or Analysis Services .abf backup). This is because there are additional structures required when the model is loaded into memory, which is particularly true of Import mode semantic models.In Chapter 2, Exploring Power BI Architecture and Configuration, we learned about Power BI’s compressed column storage engine. The DMV statistics provided by VertiPaq Analyzer let us see just how compressible columns are and how much space they are taking up. It also allows us to observe other objects, such as relationships.The Columns tab is a great way to see whether you have any columns that are very large relative to others or the entire dataset. The following figure shows the columns view for the same model we saw in Figure 6.9. You can see how from 238 columns, a single column called SalesOrderNumber takes up a staggering 22.40% of the whole model size! It’s interesting to see its Cardinality (or uniqueness) value is about twelve times lower than the next largest column (SalesKey):|Figure 6.10 – Two columns monopolizing the semantic modelIn Figure 6.10, we can also see that Data Type is String for Online Sale-SalesOrderNumber, which was a column suggested by Tabular Editor to have a large dictionary footprint. These statistics would lead you to deduce that this column contains long, unique test values that do not compress well because there is a large cardinality. Indeed, in this case, the column contains a sales order number that is unique to each order plus is not used to group or slice analytical data in a Power BI report well.This analysis may lead you to re-evaluate the need for this level of reporting in the analysis of sales data. You’d need to ask yourself whether the extra storage space and time taken to build compressed columns and potentially other structures is worth it for your business case. In cases of highly detailed data such as this where you do not need detail-level sales order data, consider limiting the analysis to customer-related data such as demographics or date attributes such as year and month.Now, let’s learn about how DAX Studio can help us with performance analysis and improvement.Performance tuning the data model and DAXThe first-party option for capturing Analysis Services traces is SQL Server Profiler. When starting a trace, you must identify exactly which events to capture, which requires some knowledge of the trace events and what they contain. Even with this knowledge, working with the trace data in Profi ler can be tough since the tool was designed primarily to work with SQL Server application traces. The good news is that DAX Studio can start an Analysis Services server trace and then parse and format all the data to show you relevant results in a well-presented way within its user interface. It allows us to both tune and measure queries in a single place and provides features for Analysis Services that make it a good alternative SQL profiler for tuning semantic models.Capturing and replaying queriesThis All Queries command in the Traces section of the DAX Studio toolbar will start a trace against the semantic model you have connected to. Figure 6.11 shows the result when a trace is successfully started:Figure 6.11 – Query trace successfully started in DAX StudioOnce your trace has started, you can interact with the semantic model outside DAX Studio, and it will capture queries for you. How you interact with the semantic model depends on where it is. For a semantic model running on your computer in Power BI Desktop, you would simply interact with the report. This would generate queries that DAX Studio will see. The All Queries tab at the bottom of the tool is where the captured queries are listed in time order with durations in milliseconds. The following figure shows two queries captured when opening the Unique by Account No page from the Slow vs Fast Measures.pbix sample file:Figure 6.12 – Queries captured by DAX StudioThe preceding queries come from a screen that has the same table results in a visual, but two different DAX measures that calculate the aggregation. These measures make one table come back in less than a second while the other returns in about 17 seconds. The following figure shows the page in the report:Figure 6.13 – Tables with the same results but from using different measuresThe following screenshot shows the results of the Performance Analyzer for the tables previously.Observe how one query took over 17 seconds, whereas the other took under 1 second:Figure 6.14 – Vastly different query durations for the same visual resultIn Figure 6.12, the second query was double-clicked to bring the DAX text to the editor. You can modify this query in DAX Studio to test performance changes. We see here that the DAX expression for the UniqueRedProducts_Slow measure was not efficient. We’ll learn a technique to optimize queries soon, but first, we need to learn about capturing query performance traces.Obtaining query timingsTo get detailed query performance information, you can use the Server Timings command shown in Figure 6.11. After starting the trace, you can run queries and then use the Server Timings tab to see how the engine executed the query, as shown in the following figure:Figure 6.15 – Server Timings showing detailed query performance statisticsFigure 6.15 gives very useful information. FE and SE refer to the formula engine and storage engine. The storage engine is fast and multi-threaded, and its job is fetching data. It can apply basic logic such as filtering data to retrieve only what is needed. The formula engine is single-threaded, and it generates a query plan, which is the physical steps required to compute the result. It also performs calculations on the data such as joins, complex filters, aggregations, and lookups. We want to avoid queries that spend most of the time in the formula engine, or that execute many queries in the storage engine. The bottom-left section of Figure 6.15 shows that we executed almost 4,900 SE queries. The list of queries to the right shows many queries returning only one result, which is suspicious.For comparison, we look at timing for the fastest version of the query and we see the following:Figure 6.16 – Server Timings for a fast version of the queryIn Figure 6.16, we can see that only three server engine queries were run this time, and the result was obtained much faster (milliseconds compared to seconds).The faster DAX measure was as follows:UniqueRedProducts_Fast = CALCULATE( DISTINCTCOUNT('SalesOrderDetail'[ProductID]), 'Product'[Color] = "Red" )The slower DAX measure was as follows:UniqueRedProducts_Slow = CALCULATE( DISTINCTCOUNT('SalesOrderDetail'[ProductID]), FILTER('SalesOrderDetail', RELATED('Product'[Color]) = "Red"))TipThe Analysis Services engine does use data caches to speed up queries. These caches contain uncompressed query results that can be reused later to save time fetching and decompressing data. You should use the Clear Cache button in DAX Studio to force these caches to be cleared and get a proper worst-case performance measure. This is visible in the menu bar in Figure 6.11.We will build on these concepts when we look at DAX and model optimizations in later chapters. Now, let’s look at how we can experiment with DAX and query changes in DAX Studio.Modifying and tuning queriesEarlier in this section, we saw how we could capture a query generated by a Power BI visual and then display its text. A nice trick we can use here is to use query-scoped measures to override the measure definition and see how performance differs.The following figure shows how we can search for a measure, right-click, and then pull its definition into the query editor of DAX Studio:Figure 6.17 – The Define Measure option and result in the Query paneWe can now modify the measure in the query editor, and the engine will use the local definition instead of the one defined in the model! This technique gives you a fast way to prototype DAX enhancements without having to edit them in Power BI and refresh visuals over many iterations.Remember that this technique does not apply any changes to the dataset you are connected to. You can optimize expressions in DAX Studio, then transfer the definition to Power BI Desktop/Visual Studio when ready. The following figure shows how we changed the definition of UniqueRedProducts_ Slow in a query-scoped measure to get a huge performance boast:Figure 6.18 – Modified measure giving better resultsThe technique described here can be adapted to model changes too. For example, if you wanted to determine the impact of changing a relationship type, you could run the same queries in DAX Studio before and after the change to draw a comparison.Here are some additional tips for working with DAX Studio:Isolate measure: When performance tuning a query generated by a report visual, comment out complex measures and then establish a baseline performance score. Th en, add each measure back to the query individually and check the speed. This will help identify the slowest measures in the query and visual context.Work with Desktop Performance Analyzer traces: DAX Studio has a facility to import the trace files generated by Desktop Performance Analyzer. You can import trace files using the Load Perf Data button located next to All Queries highlighted in Figure 6.12. This trace can be captured by one person and then shared with a DAX/modeling expert who can use DAX Studio to analyze and replay their behavior. The following figure shows how DAX Studio formats the data to make it easy to see which visual component is taking the most time. It was generated by viewing each of the three report pages in the Slow vs Fast Measures.pbix sample file:Figure 6.19 – Performance Analyzer trace shows the slowest visual in the reportExport/import model metrics: DAX Studio has a facility to export or import the VertiPaq model metadata using .vpax files. These files do not contain any of your data. They contain table names, column names, and measure definitions. If you are not concerned with sharing these definitions, you can provide .vpax files to others if you need assistance with model optimizationConclusionDAX Studio and VertiPaq Analyzer are indispensable tools for anyone working with Power BI or Analysis Services models. From detailed model size analysis to advanced performance tuning, these tools empower users to identify inefficiencies and implement optimizations effectively. By using their robust features, such as the ability to view metrics, trace query performance, and prototype query changes, professionals can ensure their models are both efficient and scalable. Mastery of these tools lays a solid foundation for building high-performing, resource-efficient analytical solutions.Author BioThomas LeBlanc is a seasoned Business Intelligence Architect at Data on the Geaux, where he applies his extensive skillset in dimensional modeling, data visualization, and analytical modeling to deliver robust solutions. With a Bachelor of Science in Management Information Systems from Louisiana State University, Thomas has amassed over 30 years of experience in Information Technology, transitioning from roles as a software developer and database administrator to his current expertise in business intelligence and data warehouse architecture and management.Throughout his career, Thomas has spearheaded numerous impactful projects, including consulting for various companies on Power BI implementation, serving as lead database administrator for a major home health care company, and overseeing the implementation of Power BI and Analysis Service for a large bank. He has also contributed his insights as an author to the Power BI MVP book.Thomas is recognized as a Microsoft Data Platform MVP and is actively engaged in the tech community through his social media presence, notably as TheSmilinDBA on Twitter and ThePowerBIDude on Bluesky and Mastodon. With a passion for solving real-world business challenges with technology, Thomas continues to drive innovation in the field of business intelligence.Bhavik Merchant has nearly 18 years of deep experience in Business Intelligence. He is currently the Director of Product Analytics at Salesforce. Prior to that, he was at Microsoft, first as a Cloud Solution Architect and then as a Product Manager in the Power BI Engineering team. At Power BI, he led the customer-facing insights program, being responsible for the strategy and technical framework to deliver system-wide usage and performance insights to customers. Before Microsoft, Bhavik spent years managing high-caliber consulting teams delivering enterprise-scale BI projects. He has provided extensive technical and theoretical BI training over the years, including expert Power BI performance training he developed for top Microsoft Partners globally.

0
0
45776

article-image-enhancing-observability-with-azure-native-isv-services-and-third-party-integrations

José Ángel Fernández, Manuel Lázaro Ramírez

02 Dec 2024

15 min read

Enhancing Observability with Azure Native ISV Services and Third-Party Integrations

José Ángel Fernández, Manuel Lázaro Ramírez

02 Dec 2024

15 min read

This article is an excerpt from the book, "Cloud Observability with Azure Monitor", by José Ángel Fernández, Manuel Lázaro Ramírez. This book is your guide to understanding the dynamic landscape of cloud monitoring with Azure Monitor. You’ll gain practical insights into designing the monitoring strategies for your Azure resources with the help of examples and best practices.IntroductionAs organizations strive to maintain robust and comprehensive monitoring solutions, leveraging Azure Native ISV (Independent Software Vendor) services becomes increasingly valuable. These services are specifically designed to integrate seamlessly with Azure, providing enhanced monitoring, analytics, and management capabilities that complement Azure’s native tools. By incorporating ISV solutions, organizations can take advantage of specialized features, advanced analytics, and tailored monitoring capabilities that address unique business needs and operational requirements.In this article, we will explore the Azure Native ISV services available for monitoring. We’ll discuss the available service integration with Azure Monitor, their distinct advantages, and the added value they bring to your observability strategy. We will explore some of those services provided by Datadog, Elastic, Logz.io, Dynatrace, and New Relic. We’ll discuss the options these services provide to integrate with the Azure platform, as well as the benefits they offer.Azure Native DatadogAzure Native Datadog is a powerful, cloud-native monitoring and security platform that integrates seamlessly with Azure. Designed to provide comprehensive visibility into the health and performance of your applications and infrastructure, Datadog offers robust features such as real-time metrics, advanced analytics, and customizable dashboards. With Azure Native Datadog, organizations can monitor Azure resources alongside other cloud and on-premises environments, enabling a unified approach to observability.Datadog’s integration with Azure enables the automatic discovery and monitoring of Azure resources, including virtual machines, databases, and services. It provides real-time monitoring through continuous collection and analysis of metrics, logs, and traces from your Azure environment. It supports both IaaS and PaaS environments, thanks to its extensive integration with more than 40 services.Information collected can be used for advanced analytics and custom dashboards. You can utilize machine learning algorithms to detect anomalies and forecast trends, gain insights into application performance, and create detailed visualizations tailored to your specific needs, combining data from Azure and other sources.Security is also relevant, thanks to its alerting and incident management capabilities. Set up proactive alerts and manage incidents efficiently to minimize downtime and impact. Improve your security inside Azure through its Cloud Security management features.By leveraging Azure Native Datadog, organizations benefit from single-pane-of-glass visibility in hybrid and multi-cloud environments. Its costs are integrated into your Azure monthly bill directly, and access is transparent through the single sign-on integration.Metrics and activity log ingestion are automatically configured, and installation of the custom Datadog agents can be automated for your virtual machines. More information is available at https://learn.microsoft.com/en-us/azure/partnersolutions/datadog/create.Azure Native Elastic CloudAzure Native Elastic is an integrated solution that combines the power of Elasticsearch, Kibana, and other Elastic Stack components with Azure’s cloud capabilities. Elastic offers robust search, observability, and security solutions that help organizations gain deep insights into their Azure environments. By using Azure Native Elastic, you can seamlessly ingest, search, and visualize data from Azure resources, enabling advanced analytics and improved operational efficiency.Elastic’s integration with Azure provides a seamless experience for deploying and managing its CloudNative Observability Platform. It is provided as a Software-as-a-Service (SaaS) application through the Azure Marketplace, which centralizes log, metric, and trace analytics, simplifying the monitoring of Azure environments for Elastic clients.Users can manage Elastic solutions directly through the Azure portal, implementing monitoring for cloud workloads via a streamlined workflow. Provisioning Elastic resources is facilitated by a custom resource provider, allowing the creation, provisioning, and management of Elastic resources within Azure, with Elastic managing the SaaS application and associated accounts.It provides a similar experience to the previous solution through a single-pane-of-glass visibility platform, with a unified billing experience integrated into your Azure bill and transparent access to Elastic solutions through single sign-on integration. Metrics and activity log ingestion are automatically configured, and installation of the custom Elastic agents can be automated for your virtual machines.More information is available at https://learn.microsoft.com/en-us/azure/partnersolutions/elastic/create.Azure Native Logz.ioAzure Native Logz.io is a cloud-native observability platform that combines the best open-source tools – OpenSearch, OpenTelemetry, and Prometheus – in a unified solution. Logz.io provides advanced log management, metrics monitoring, and tracing capabilities, helping organizations achieve comprehensive observability across their Azure environments. With seamless integration and powerful analytics, Azure Native Logz.io enhances your ability to monitor and troubleshoot applications and infrastructure.Logz.io’s integration with Azure simplifies the deployment and management of observability tools. It is also provided as a SaaS application through the Azure Marketplace, which centralizes log, metric, and trace analytics. You can now provision the Logz.io resources through a custom resource provider that creates, provisions, and manages Logz.io resources through the Azure portal. Logz.io runs the SaaS, and Azure provides the interface to manage the resources.Azure Native Logz.io empowers organizations to enhance their observability strategy, ensuring the reliability and performance of their applications and infrastructure through integrated log, metric, and trace management.More information is available at https://learn.microsoft.com/en-us/azure/partnersolutions/logzio/create.Azure Native DynatraceAzure Native Dynatrace is a comprehensive observability platform designed to provide deep insights into the performance and health of your Azure applications and infrastructure. Dynatrace leverages artificial intelligence and automation to deliver precise answers, helping organizations optimize their operations and improve user experiences. With seamless Azure integration, Dynatrace offers monitoring capabilities across cloud and hybrid environments.Dynatrace’s integration with Azure enables the automatic discovery and monitoring of Azure resources, offering a rich set of features such as AI-driven monitoring, using AI to automatically detect anomalies, identify root causes, and predict potential issues, or full stack observability that monitors the entire stack, from infrastructure to applications, in real-time.Azure Native Dynatrace provides the same key benefits discussed in the previous solutions related to integration, billing, and automation of agent deployment and information collection.More information is available at https://learn.microsoft.com/en-us/azure/partnersolutions/dynatrace/dynatrace-create.Azure Native New RelicAzure Native New Relic is a powerful observability platform that offers comprehensive monitoring and analytics capabilities for your Azure applications and infrastructure. Designed to provide real-time visibility and actionable insights, New Relic integrates seamlessly with Azure, enabling organizations to monitor the performance and health of their environments with precision. By leveraging Azure Native New Relic, you can optimize application performance, enhance user experiences, and ensure operational excellence.New Relic’s integration with Azure allows effortless monitoring of Azure resources, featuring continuous monitoring of applications and infrastructure for real-time insights, powerful analytics to gain a deeper understanding of performance metrics and user behavior, custom dashboards to visualize key performance indicators and trends, and distributed tracing to track and analyze end-to-end transactions across distributed systems, helping you to identify performance bottlenecks.Adopting Azure Native New Relic provides the same key benefits discussed in the previous solutions related to integration, billing, and automation of agent deployment and information collection.You can learn more i nformation at https://learn.microsoft.com/en-us/azure/ partner-solutions/new-relic/new-relic-create.Additional third-party services for integrationIn addition to Azure Native ISV services, numerous third-party services also offer robust integration capabilities with Azure Monitor. These integrations extend the functionality of Azure Monitor, providing specialized features and advanced analytics that enhance your observability strategy. Leveraging these third-party services allows organizations to tailor their monitoring and security solutions to meet specific business needs, ensuring comprehensive visibility and control over their Azure environments.Those third-party services are as follows:IBM QRadar is a leading Security Information and Event Management (SIEM) solution that helps organizations detect and respond to security threats. Integrating QRadar with Azure Monitor allows you to centralize security event data from your Azure environment and gain deeper insights into potential security incidents. You can read more about it at https:// www.ibm.com/docs/en/qsip/7.5?topic=extensions-azure.Splunk is a powerful platform for searching, monitoring, and analyzing machine-generated data. Integrating Splunk with Azure Monitor enables you to collect, analyze, and visualize data from your Azure resources, enhancing your ability to monitor performance and detect issues. More information about this is available at https://splunk.github.io/splunkadd-on-for-microsoft-cloud-services/.Sumo Logic is a cloud-native, continuous intelligence platform for log management and analytics. Integrating Sumo Logic with Azure Monitor allows you to aggregate, monitor, and analyze log and metric data from your Azure resources, improving operational and security insights. More information i s available at https://help.sumologic.com/docs/ send-data/collect-from-other-data-sources/azure-monitoring/.ArcSight is a leading SIEM solution that provides advanced threat detection and response capabilities. Integrating ArcSight with Azure Monitor allows you to centralize security event data and gain actionable insights to protect your Azure environment. Read more about it at https://www.microfocus.com/documentation/arcsight/arcsightsmartconnectors/#gsc.tab=0.Syslog servers are a critical component of many IT infrastructures, providing centralized logging for network devices, servers, and applications. Integrating Syslog servers with Azure Monitor allows you to collect, store, and analyze Syslog data from your Azure environment, improving visibility and operational efficiency. Further information is available at https://learn. microsoft.com/en-us/azure/azure-monitor/agents/data-collectionsyslog.ConclusionAzure Native ISV services and third-party integrations provide organizations with a diverse set of tools to optimize observability, enhance operational efficiency, and address unique monitoring challenges. By leveraging these solutions, businesses can achieve comprehensive visibility across their Azure environments, enabling proactive management, improved performance, and robust security. Whether it's integrating Datadog for real-time analytics, Elastic for advanced search capabilities, or New Relic for deep performance insights, these services empower organizations to tailor their monitoring strategies and unlock the full potential of Azure.Author BioJosé Ángel Fernández has worked as a Microsoft Specialist and Cloud Solution Architect, specializing in advanced cloud migrations, with extensive technical expertise and a deep understanding of Azure solutions. He has been focused on the cloud for the last 11 years at Microsoft, starting at the same time virtual machines reached general availability and Azure Monitor was not yet a product.José Ángel graduated with a degree in telecommunications engineering from the Technical University of Madrid in 2013. He later earned a degree in big data analytics from the Graduate School of Engineering and Basic Sciences of Charles III University of Madrid in 2020.He resides in Madrid, Spain with his wife, his three-year-old child, and an adopted black cat that has never brought him bad luck.Manuel Lázaro Ramírez is a Microsoft Cloud Solution Architect with a wide technical breadth and deep understanding of Azure solutions. He has been focused on designing and implementing cloud architectures in different industries for the last 10 years.Manuel graduated with a degree in pure and applied mathematics from Complutense University of Madrid in 2013 and later earned a master’s degree in pure and applied mathematics from Complutense University of Madrid in 2014.He resides in Madrid, Spain, with his wife, and his passion is developing code with their friends and working and solving real-world business problems with cloud technology to deliver real value.

0
0
49007

article-image-managing-ai-security-risks-with-zero-trust-a-strategic-guide

Mark Simos, Nikhil Kumar

29 Nov 2024

15 min read

Managing AI Security Risks with Zero Trust: A Strategic Guide

Mark Simos, Nikhil Kumar

29 Nov 2024

15 min read

This article is an excerpt from the book, "Zero Trust Overview and Playbook Introduction", by Mark Simos, Nikhil Kumar. Get started on Zero Trust with this step-by-step playbook and learn everything you need to know for a successful Zero Trust journey with tailored guidance for every role, covering strategy, operations, architecture, implementation, and measuring success. This book will become an indispensable reference for everyone in your organization.IntroductionIn today’s rapidly evolving technological landscape, artificial intelligence (AI) is both a powerful tool and a significant security risk. Traditional security models focused on static perimeters are no longer sufficient to address AI-driven threats. A Zero Trust approach offers the agility and comprehensive safeguards needed to manage the unique and dynamic security risks associated with AI. This article explores how Zero Trust principles can be applied to mitigate AI risks and outlines the key priorities for effectively integrating AI into organizational security strategies.How can Zero Trust help manage AI security risk?A Zero Trust approach is required to effectively manage security risks related to AI. Classic network perimeter-centric approaches are built on more than 20-year-old assumptions of a static technology environment and are not agile enough to keep up with the rapidly evolving security requirements of AI.The following key elements of Zero Trust security enable you to manage AI risk:Data centricity: AI has dramatically elevated the importance of data security and AI requires a data-centric approach that can secure data throughout its life cycle in any location.Zero Trust provides this data-centric approach and the playbooks in this series guide the roles in your organizations through this implementation.Coordinated management of continuous dynamic risk: Like modern cybersecurity attacks, AI continuously disrupts core assumptions of business, technical, and security processes. This requires coordinated management of a complex and continuously changing security risk.Zero Trust solves this kind of problem using agile security strategies, policies, and architecture to manage the continuous changes to risks, tooling, processes, skills, and more. The playbooks in this series will help you make AI risk mitigation real by providing specific guidance on AI security risks for all impacted roles in the organization. Let’s take a look at which specific elements of Zero Trust are most important to managing AI risk.Zero Trust – the top four priorities for managing AI riskManaging AI risk requires prioritizing a few key areas of Zero Trust to address specific unique aspects of AI. The role of specific guidance in each playbook provides more detail on how each role will incorporate AI considerations into their daily work.These priorities follow the simple themes of learn it, use it, protect against it, and work as a team. This is similar to a rational approach for any major disruptive change to any other type of competition or conflict (a military organization learning about a new weapon, professional sports players learning about a new type of equipment or rule change, and so on).The top four priorities for managing AI risk are as follows:1. Learn it – educate everyone and set realistic expectations: The AI capabilities available today are very powerful, affect everyone, and are very different than what people expect them to be. It’s critical to educate every role in the organization, from board members and CEOs to individual contributors, as they all must understand what AI is, what AI really can and cannot do, as well as the AI usage policy and guidelines. Without this, people’s expectations may be wildly inaccurate and lead to highly impactful mistakes that could have easily been avoided.Education and expectation management is particularly urgent for AI because of these factors:Active use in attacks: Attackers are already using AI to impersonate voices, email writing styles, and more.Active use in business processes: AI is freely available for anyone to use. Job seekers are already submitting AI-generated resumes for your jobs that use your posted job descriptions, people are using public AI services to perform job tasks (and potentially disclosing sensitive information), and much more.Realism: The results are very realistic and convincing, especially if you don’t know how good AI is at creating fake images, videos, and text.How can Zero Trust help manage AI security risk?Confusion: Many people don’t have a good frame of reference for it because of the way AI has been portrayed in popular culture (which is very different from the current reality of AI).2. Use it – integrate AI into security: Immediately begin evaluating and integrating AI into your security tooling and processes to take advantage of their increased effectiveness and efficiency. This will allow you to quickly take advantage of this powerful technology to better manage security risk. AI will impact nearly every part of security, including the following:Security risk discovery, assessment, and management processesThreat detection and incident response processesArchitecture and engineering security defensesIntegrating security into the design and operation of systems…and many more3. Protect against it – update the security strategy, policy, and controls: Organizations must urgently update their strategy, policy, architecture, controls, and processes to account for the use of AI technology (by business units, technology teams, security teams, attackers, and more). This helps enable the organization to take full advantage of AI technology while minimizing security risk.The key focus areas should include the following:Plan for attacker use of AI: One of the first impacts most organizations will experience is rapid adoption by attackers to trick your people. Attackers are using AI to get an advantage on target organizations like yours, so you must update your security strategy, threat models, architectures, user education, and more to defend against attackers using AI or targeting you for your data. This should change the organization’s expectations and assumptions for the following aspects:Attacker techniques: Most attackers will experiment with and integrate AI capabilities into their attacks, such as imitating the voices of your colleagues on phone calls, imitating writing styles in phishing emails, creating convincing fake social media pictures and profiles, creating convincing fake company logos and profiles, and more.Attacker objectives: Attackers will target your data, AI systems, and other related assets because of their high value (directly to the attacker and/or to sell it to others). Your human-generated data is a prized high-value asset for training and grounding AI models and your innovative use of AI may be potentially valuable intellectual property, and more.Secure the organization’s AI usage: The organization must update its security strategy, plans, architecture, processes, and tooling to do the following:Secure usage of external AI: Establish clear policies and supporting processes and technology for using external AI systems safelySecure the organization’s AI and related systems: Protect the organization’s AI and related systems against attackersIn addition to protecting against traditional security attacks, the organization will also need to defend against AI-specific attack techniques that can extract source data, make the model generate unsafe or unintended results, steal the design of the AI model itself, and more. The playbooks include more details for each role to help them manage their part of this risk.Take a holistic approach: It’s important to secure the full life cycle and dependencies of the AI model, including the model itself, the data sources used by the model, the application that uses the model, the infrastructure it’s hosted on, third-party operators such as AI platforms, and other integrated components. This should also take a holistic view of the security life cycle to consider identification, protection, detection, response, recovery, and governance.Update acquisition and approval processes: This must be done quickly to ensure new AI technology (and other technology) meets the security, privacy, and ethical practices of the organization. This helps avoid extremely damaging avoidable problems such as transferring ownership of the organization’s data to vendors and other parties. You don’t want other organizations to grow and capture market share from you by using your data. You also want to avoid expensive privacy incidents and security incidents from attackers using your data against you.This should include supply chain risk considerations to mitigate both direct suppliers and Nth party risk (components of direct suppliers that have been sourced from other organizations). Finding and fixing problems later in the process is much more difficult and expensive than correcting them before or during acquisition, so it is critical to introduce these risk mitigations early.4. Work as a team – establish a coordinated AI approach: Set up an internal collaboration community or a formal Center of Excellence (CoE) team to ensure insights, learning, and best practices are being shared rapidly across teams. AI is a fast-moving space and will drive rapid continuous changes across business, technology, and security teams. You must have mechanisms in place to coordinate and collaborate across these different teams in your organization.How will AI impact Zero Trust?Each playbook describes the specific AI impacts and responsibilities for each affected role.AI shared responsibility model: Most AI technology will be a partnership with AI providers, so managing AI and AI security risk will follow a shared responsibility model between you and your AI providers. Some elements of AI security will be handled by the AI provider and some will be the responsibility of your organization (their customer).This is very similar to how cloud responsibility is managed today (and many AI providers are also cloud providers). This is also similar to a business that outsources some or all of its manufacturing, logistics, sales (for example, channel sales), or other business functions.Now, let’s take a look at how AI impacts Zero Trust.How will AI impact Zero Trust?AI will accelerate many aspects of Zero Trust because it dramatically improves the security tooling and people’s ability to use it. AI promises to reduce the burden and effort for important but tedious security tasks such as the following:Helping security analysts quickly query many data sources (without becoming an expert in query languages or tool interfaces)Helping writing incident response reportsIdentifying common follow-up actions to prevent repeat incidentSimplifying the interface between people and the complex systems they need to use for security will enable people with a broad range of skills to be more productive. Highly skilled people will be able to do more of what they are best at without repetitive and distracting tasks. People earlier in their careers will be able to quickly become more productive in a role, perform tasks at an expert level more quickly, and help them learn by answering questions and providing explanations.AI will NOT replace the need for security experts, nor the need to modernize security. AI will simplify many security processes and will allow fewer security people to do more, but it won’t replace the need for a security mindset or security expertise.Even with AI technology, people and processes will still be required for the following aspects:Ask the right security questions from AI systemsInterpret the results and evaluate their accuracyTake action on the AI results and coordinate across teamsPerform analysis and tasks that AI systems currently can’t cover:Identify, manage, and measure security risk for the organizationBuild, execute, and monitor a strategy and policyBuild and monitor relationships and processes between teamsIntegrate business, technical, and security capabilitiesEvaluate compliance requirements and ensure the organization is meeting them in good faithEvaluate the security of business and technical processesEvaluate the security posture and prioritize mitigation investmentsEvaluate the effectiveness of security processes, tools, and systemsPlan and implement security for technical systemsPlan and implement security for applications and productsRespond to and recover from attacksIn summary, AI will rapidly transform the attacks you face as well as your organization’s ability to manage security risk effectively. AI will require a Zero Trust approach and it will also help your teams do their jobs faster and more efficiently.The guidance in the Zero Trust Playbook Series will accelerate your ability to manage AI risk by guiding everyone through their part. It will help you rapidly align security to business risks and priorities and enable the security agility you need to effectively manage the changes from AI.Some of the questions that naturally come up are where to start and what to do first.ConclusionAs AI reshapes the cybersecurity landscape, adopting a Zero Trust framework is critical to effectively manage the associated risks. From securing data lifecycles to adapting to dynamic attacker strategies, Zero Trust principles provide the foundation for agile and robust AI risk management. By focusing on education, integration, protection, and collaboration, organizations can harness the benefits of AI while mitigating its risks. The Zero Trust Playbook Series offers practical guidance for all roles, ensuring security remains aligned with business priorities and prepared for the challenges AI introduces. Now is the time to embrace this transformative approach and future-proof your security strategies.Author BioMark Simos helps individuals and organizations meet cybersecurity, cloud, and digital transformation goals. Mark is the Lead Cybersecurity Architect for Microsoft where he leads the development of cybersecurity reference architectures, strategies, prescriptive planning roadmaps, best practices, and other security and Zero Trust guidance. Mark also co-chairs the Zero Trust working group at The Open Group and contributes to open standards and other publications like the Zero Trust Commandments. Mark has presented at numerous conferences including Black Hat, RSA Conference, Gartner Security & Risk Management, Microsoft Ignite and BlueHat, and Financial Executives International.Nikhil Kumar is Founder at ApTSi with prior leadership roles at Price Waterhouse and other firms. He has led setup and implementation of Digital Transformation and enterprise security initiatives (such as PCI Compliance) and built out Security Architectures. An Engineer and Computer Scientist with a passion for biology, Nikhil is an expert in Security, Information, and Computer Architecture. Known for communicating to the board and implementing with engineers and architects, he is an MIT mentor, innovator and pioneer. Nikhil has authored numerous books, standards, and articles, and presented at conferences globally. He co-chairs The Zero Trust Working Group, a global standards initiative led by the Open Group.

5
0
69798

article-image-mastering-transfer-learning-fine-tuning-bert-and-vision-transformers

Sinan Ozdemir

27 Nov 2024

15 min read

Mastering Transfer Learning: Fine-Tuning BERT and Vision Transformers

Sinan Ozdemir

27 Nov 2024

15 min read

DataPro is a weekly, expert-curated newsletter trusted by 120k+ global data professionals. Built by data practitioners, it blends first-hand industry experience with practical insights and peer-driven learning.Make sure to subscribe here so you never miss a key update in the data world. This article is an excerpt from the book, "Principles of Data Science", by Sinan Ozdemir. This book provides an end-to-end framework for cultivating critical thinking about data, performing practical data science, building performant machine learning models, and mitigating bias in AI pipelines. Learn the fundamentals of computational math and stats while exploring modern machine learning and large pre-trained models.IntroductionTransfer learning (TL) has revolutionized the field of deep learning by enabling pre-trained models to adapt their broad, generalized knowledge to specific tasks with minimal labeled data. This article delves into TL with BERT and GPT, demonstrating how to fine-tune these advanced models for text classification and image classification tasks. Through hands-on examples, we illustrate how TL leverages pre-trained architectures to simplify complex problems and achieve high accuracy with limited data.TL with BERT and GPTIn this article, we will take some models that have already learned a lot from their pre-training and fine-tune them to perform a new, related task. This process involves adjusting the model’s parameters to better suit the new task, much like fine-tuning a musical instrument:Figure 12.8 – ITLITL takes a pre-trained model that was generally trained on a semi-supervised (or unsupervised) task and then is given labeled data to learn a specific task.Examples of TLLet’s take a look at some examples of TL with specific pre-trained models.Example – Fine-tuning a pre-trained model for text classificationConsider a simple text classification problem. Suppose we need to analyze customer reviews and determine whether they’re positive or negative. We have a dataset of reviews, but it’s not nearly large enough to train a deep learning (DL) model from scratch. We will fine-tune BERT on a text classification task, allowing the model to adapt its existing knowledge to our specific problem.We will have to move away from the popular scikit-learn library to another popular library called transformers, which was created by HuggingFace (the pre-trained model repository I mentioned earlier) as scikit-learn does not (yet) support Transformer models.Figure 12.9 shows how we will have to take the original BERT model and make some minor modifications to it to perform text classification. Luckily, the transformers package has a built-in class to do this for us called BertForSequenceClassification:Figure 12.9 – Simplest text classification caseIn many TL cases, we need to architect additional layers. In the simplest text classification case, we add a classification layer on top of a pre-trained BERT model so that it can perform the kind of classification we want.The following code block shows an end-to-end code example of fine-tuning BERT on a text classification task. Note that we are also using a package called datasets, also made by HuggingFace, to load a sentiment classification task from IMDb reviews. Let’s begin by loading up the dataset:# Import necessary libraries from datasets import load_dataset from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments # Load the dataset imdb_data = load_dataset('imdb', split='train[:1000]') # Loading only 1000 samples for a toy example # Define the tokenizer tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') # Preprocess the data def encode(examples): return tokenizer(examples['text'], truncation=True, padding='max_ length', max_length=512) imdb_data = imdb_data.map(encode, batched=True) # Format the dataset to PyTorch tensors imdb_data.set_format(type='torch', columns=['input_ids', 'attention_ mask', 'label'])With our dataset loaded up, we can run some training code to update our BERT model on our labeled data:# Define the model model = BertForSequenceClassification.from_pretrained( 'bert-base-uncased', num_labels=2) # Define the training arguments training_args = TrainingArguments( output_dir='./results', num_train_epochs=1, per_device_train_batch_size=4 ) # Define the trainer trainer = Trainer(model=model, args=training_args, train_dataset=imdb_ data) # Train the model trainer.train() # Save the model model.save_pretrained('./my_bert_model')Once we have our saved model, we can use the following code to run the model against unseen data:from transformers import pipeline # Define the sentiment analysis pipeline nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer) # Use the pipeline to predict the sentiment of a new review review = "The movie was fantastic! I enjoyed every moment of it." result = nlp(review) # Print the result print(f"label: {result[0]['label']}, with score: {round(result[0] ['score'], 4)}") # "The movie was fantastic! I enjoyed every moment of it." # POSITIVE: 99%Example – TL for image classificationWe could take a pre-trained model such as ResNet or the Vision Transformer (shown in Figure 12.10), initially trained on a large-scale image dataset such as ImageNet. This model has already learned to detect various features from images, from simple shapes to complex objects. We can take advantage of this knowledge, fi ne-tuning the model on a custom image classification task:Figure 12.10 – The Vision TransformerThe Vision Transformer is like a BERT model for images. It relies on many of the same principles, except instead of text tokens, it uses segments of images as “tokens” instead.The following code block shows an end-to-end code example of fine-tuning the Vision Transformer on an image classification task. The code should look very similar to the BERT code from the previous section because the aim of the transformers library is to standardize training and usage of modern pre-trained models so that no matter what task you are performing, they can offer a relatively unified training and inference experience.Let’s begin by loading up our data and taking a look at the kinds of images we have (seen in Figure 12.11). Note that we are only going to use 1% of the dataset to show that you really don’t need that much data to get a lot out of pre-trained models!# Import necessary libraries from datasets import load_dataset from transformers import ViTImageProcessor, ViTForImageClassification from torch.utils.data import DataLoader import matplotlib.pyplot as plt import torch from torchvision.transforms.functional import to_pil_image # Load the CIFAR10 dataset using Hugging Face datasets # Load only the first 1% of the train and test sets train_dataset = load_dataset("cifar10", split="train[:1%]") test_dataset = load_dataset("cifar10", split="test[:1%]") # Define the feature extractor feature_extractor = ViTImageProcessor.from_pretrained('google/vitbase-patch16-224') # Preprocess the data def transform(examples): # print(examples) # Convert to list of PIL Images examples['pixel_values'] = feature_ extractor(images=examples["img"], return_tensors="pt")["pixel_values"] return examples # Apply the transformations train_dataset = train_dataset.map( transform, batched=True, batch_size=32 ).with_format('pt') test_dataset = test_dataset.map( transform, batched=True, batch_size=32 ).with_format('pt')We can similarly use the model using the following code:Figure 12.11 – A single example from CIFAR10 showing an airplaneNow, we can train our pre-trained Vision Transformer:# Define the model model = ViTForImageClassification.from_pretrained( 'google/vit-base-patch16-224', num_labels=10, ignore_mismatched_sizes=True ) LABELS = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'] model.config.id2label = LABELS # Define a function for computing metrics def compute_metrics(p): predictions, labels = p preds = np.argmax(predictions, axis=1) return {"accuracy": accuracy_score(labels, preds)} # Define the training arguments training_args = TrainingArguments( output_dir='./results', num_train_epochs=5, per_device_train_batch_size=4, load_best_model_at_end=True, # Save and evaluate at the end of each epoch evaluation_strategy='epoch', save_strategy='epoch' ) # Define the trainer trainer = Trainer( model=model, args=training_args, train_dataset=train_dataset, eval_dataset=test_dataset )Our final model has about 95% accuracy on 1% of the test set. We can now use our new classifier on unseen images, as in this next code block:from PIL import Image from transformers import pipeline # Define an image classification pipeline classification_pipeline = pipeline( 'image-classification', model=model, feature_extractor=feature_extractor ) # Load an image image = Image.open('stock_image_plane.jpg') # Use the pipeline to classify the image result = classification_pipeline(image)Figure 12.12 shows the result of this single classification, and it looks like it did pretty well:Figure 12.12 – Our classifier predicting a stock image of a plane correctlyWith minimal labeled data, we can leverage TL to turn models off the shelf into powerhouse predictive models.ConclusionTransfer learning is a transformative technique in deep learning, empowering developers to harness the power of pre-trained models like BERT and the Vision Transformer for specialized tasks. From sentiment analysis to image classification, these models can be fine-tuned with minimal labeled data, offering impressive performance and adaptability. By using libraries like HuggingFace’s transformers, TL streamlines model training, making state-of-the-art AI accessible and versatile across domains. As demonstrated in this article, TL is not only efficient but also a practical way to achieve powerful predictive capabilities with limited resources.Author BioSinan is an active lecturer focusing on large language models and a former lecturer of data science at the Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius which uses AI to help people and businesses boost their sales and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master’s Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.

0
0
38736

article-image-airflow-ops-best-practices-observation-and-monitoring

Dylan Intorf, Kendrick van Doorn, Dylan Storey

12 Nov 2024

15 min read

Airflow Ops Best Practices: Observation and Monitoring

Dylan Intorf, Kendrick van Doorn, Dylan Storey

12 Nov 2024

15 min read

This article is an excerpt from the book, "Apache Airflow Best Practices", by Dylan Intorf, Kendrick van Doorn, Dylan Storey. With practical approach and detailed examples, this book covers newest features of Apache Airflow 2.x and it's potential for workflow orchestration, operational best practices, and data engineering.IntroductionIn this article, we will continue to explore the application of modern “ops” practices within Apache Airflow, focusing on the observation and monitoring of your systems and DAGs after they’ve been deployed.We’ll divide this observation into two segments – the core Airflow system and individual DAGs. Each segment will cover specific metrics and measurements you should be monitoring for alerting and potential intervention.When we discuss monitoring in this section, we will consider two types of monitoring – active and suppressive.In an active monitoring scenario, a process will actively check a service’s health state, recording its state and potentially taking action directly on the return value.In a suppressive monitoring scenario, the absence of a state (or state change) is usually meaningful. In these scenarios, the monitored application sends an active schedule to a process to inform it that it is OK, usually suppressing an action (such as an alert) from occurring.This chapter covers the following topics:Monitoring core Airflow componentsMonitoring your DAGsTechnical requirementsBy now, we expect you to have a good understanding of Airflow and its core components, along with functional knowledge in the deployment and operation of Airflow and Airflow DAGs.We will not be covering specific observability aggregators or telemetry tools; instead, we will focus on the activities you should be keeping an eye on. We strongly recommend that you work closely with your ops teams to understand what tools exist in your stack and how to configure them for capture and alerting your deployments.Monitoring core Airflow componentsAll of the components we will discuss here are critical to ensuring a functioning Airflow deployment. Generally, all of them should be monitored with a bare minimum check of Is it on? and if a component is not, an alert should surface to your team for investigation. The easiest way to check this is to query the REST API on the web server at `/health/`; this will return a JSON object that can be parsed to determine whether components are healthy and, if not, when they were last seen.SchedulerThis component needs to be running and working effectively in order for tasks to be scheduled for execution.When the scheduler service is started, it also starts a `/health` endpoint that can be checked by an external process with an active monitoring approach.The returned signal does not always indicate that the scheduler is working properly, as its state is simply indicative that the service is up and running. There are many scenarios where the scheduler may be operating but unable to schedule jobs; as a result, many deployments will include a canary dag to their deployment that has a single task, acting to suppress an external alert from going off.Import metrics that airflow exposes for you include the following:scheduler.scheduler_loop_duration: This should be monitored to ensure that your scheduler is able to loop and schedule tasks for execution. As this metric increases, you will see tasks beginning to schedule more slowly, to the point where you may begin missing SLAs because tasks fail to reach a schedulable state.scheduler.tasks.starving: This indicates how many tasks cannot be scheduled because there are no slots available. Pools are a mechanism that Airflow uses to balance large numbers of submitted task executions versus a finite amount of execution throughput. It is likely that this number will not be zero, but being high for extended periods of time may point to an issue in how DAGs are being written to schedule work.scheduler.tasks.executable: This indicates how many tasks are ready for execution (i.e., queued). This number will sometimes not be zero, and that is OK, but if the number increases and stays high for extended periods of time, it indicates that you may need additional computer resources to handle the load. Look at your executor to increase the number of workers it can run. Metadata databaseThe metadata database is used to store and track all of the metadata for your Airflow deployments’ previous DAG/task executions, along with information about your environment’s roles and permissions. Losing data from this database can interrupt normal operations and cause unintended consequences, with DAG runs being repeated.While critical, because it is architecturally ubiquitous, the database is also least likely to encounter issues, and if it does, they are absolutely catastrophic in nature.We generally suggest you utilize a managed service for provisioning and operating your backing database, ensuring that a disaster recovery plan for your metadata database is in place at all times.Some active areas to monitor on your database include the following:Connection pool size/usage: Monitor both the connection pool size and usage over time to ensure appropriate configuration, and identify potential bottlenecks or resource contention arising from Airflow components’ concurrent connections.Query performance: Measure query latency to detect inefficient queries or performance issues, while monitoring query throughput to ensure effective workload handling by the database.Storage metrics: Monitor the disk space utilization of the metadata database to ensure that it has sufficient storage capacity. Set up alerts for low disk space conditions to prevent database outages due to storage constraints.Backup status: Monitor the status of database backups to ensure that they are performed regularly and successfully. Verify backup integrity and retention policies to mitigate the risk of data loss if there is a database failure.TriggererThe Triggerer instance manages all of the asynchronous operations of deferrable operators in a deferred state. As such, major operational concerns generally relate to ensuring that individual deferred operators don’t cause major blocking calls to the event loop. If this occurs, your deferrable tasks will not be able to check their state changes as frequently, and this will impact scheduling performance.Import metrics that airflow exposes for you include the following:triggers.blocked_main_thread: The number of triggers that have blocked the main thread. This is a counter and should monotonically increase over time; pay attention to large differences between recording (or quick acceleration) counts, as it’s indicative of a larger problem.triggers.running: The number of triggers currently on a triggerer instance. This metric should be monitored to determine whether you need to increase the number of triggerer instances you are running. While the official documentation claims that up to tens of thousands of triggers can be on an instance, the common operational number is much lower. Tune at your discretion, but depending on the complexity of your triggers, you may need to add a new instance for every few hundred consistent triggers you run.Executors/workersDepending on the executor you use, you will need to monitor your executors and workers a bit differently.The Kubernetes executor will utilize the Kubernetes API to schedule tasks for execution; as such, you should utilize the Kubernetes events and metrics servers to gather logs and metrics for your task instances. Common metrics to collect on an individual task are CPU and memory usage. This is crucial for tuning requests or mutating individual task resource requests to ensure that they execute safely.The Celery worker has additional components and long-lived processes that you need to metricize. You should monitor an individual Celery worker’s memory and CPU utilization to ensure that it is not over- or under-provisioned, tuning allocated resources accordingly. You also need to monitor the message broker (usually Redis or RabbitMQ) to ensure that it is appropriately sized. Finally, it is critical to measure the queue length of your message broker and ensure that too much “back pressure” isn’t being created in the system. If you find that your tasks are sitting in a queued state for a long period of time and the queue length is consistently growing, it’s a sign that you should start an additional Celery worker to execute on scheduled tasks. You should also investigate using the native Celery monitoring tool Flower (https://flower.readthedocs.io/en/latest/) for additional, more nuanced methods of monitoring.Web serverThe Airflow web server is the UI for not just your Airflow deployment but also the RESTful interface. Especially if you happen to be controlling Airflow scheduling behavior with API calls, you should keep an eye on the following metrics:Response time: Measure the time taken for the API to respond to requests. This metric indicates the overall performance of the API and can help identify potential bottlenecks.Error rate: Monitor the rate of errors returned by the API, such as 4xx and 5xx HTTP status codes. High error rates may indicate issues with the API implementation or underlying systems.Request rate: Track the rate of incoming requests to the API over time. Sudden spikes or drops in request rates can impact performance and indicate changes in usage patterns.System resource utilization: Monitor resource utilization metrics such as CPU, memory, disk I/O, and network bandwidth on the servers hosting the API. High resource utilization can indicate potential performance bottlenecks or capacity limits.Throughput: Measure the number of successful requests processed by the API per unit of time. Throughput metrics provide insights into the API’s capacity to handle incoming traffic.Now that you have some basic metrics to collect from your core architectural components and can monitor the overall health of an application, we need to monitor the actual DAGs themselves to ensure that they function as intended.Monitoring your DAGsThere are multiple aspects to monitoring your DAGs, and while they’re all valuable, they may not all be necessary. Take care to ensure that your monitoring and alerting stack match your organizational needs with regard to operational parameters for resiliency and, if there is a failure, recovery times. No matter how much or how little you choose to implement, knowing that your DAGs work and if and how they fail is the first step in fixing problems that will arise.LoggingAirflow writes logs for tasks in a hierarchical structure that allows you to see each task’s logs in the Airflow UI. The community also provides a number of providers to utilize other services for backing log storage and retrieval. A complete list of supported providers is available at https://airflow.apache.org/docs/apache-airflow-providers/core-extensions/logging.html.Airflow uses the standard Python logging framework to write logs. If you’re writing custom operators or executing Python functions with a PythonOperator, just make sure that you instantiate a Python logger instance, and then the associated methods will handle everything for you.AlertingAirflow provides mechanisms for alerting on operational aspects of your executing workloads that can be configured within your DAG:Email notifications: Email notifications can be sent if a task is put into a marked or retry state with the `email_on_failure` or `email_on_retry` state, respectively. These arguments can be provided to all tasks in the DAG with the `default_args` key work in the DAG, or individual tasks by setting the keyword argument individually.Callbacks: Callbacks are special actions that are executed if a specific state change occurs. Generally, these callbacks should be thoughtfully leveraged to send alerts that are critical operationally:on_success_callback: This callback will be executed at both the task and DAG levels when entering a successful state. Unless it is critical that you know whether something succeeds, we generally suggest not using this for alerting.on_failure_callback: This callback is invoked when a task enters a failed state. Generally, this callback should always be set and, in critical scenarios, alert on failures that require intervention and support.on_execute_callback: This is invoked right before a task executes and only exists at the task level. Use sparingly for alerting, as it can quickly become a noisy alert when overused.on_retry_callback: This is invoked when a task is placed in a retry state. This is another callback to be cautious about as an alert, as it can become noisy and cause false alarms.sla_miss_callback: This is invoked when a DAG misses its defined SLA. This callback is only executed at the end of a DAG’s execution cycle so tends to be a very reactive notification that something has gone wrong.SLA monitoringAs awesome of a tool as Airflow is, it is a well-known fact in the community that SLAs, while largely functional, have some unfortunate details with regard to implementation that can make them problematic at best, and they are generally regarded as a broken feature in Airflow. We suggest that if you require SLA monitoring on your workflows, you deploy a CRON job monitoring tool such as healthchecks (https://github.com/healthchecks/healthchecks) that allows you to create suppressive alerts for your services through its rest API to manage SLAs. By pairing this third- party service with either HTTP operators or simple requests from callbacks, you can ensure that your most critical workflows achieve dynamic and resilient SLA alerting.Performance profilingThe Airflow UI is a great tool for profiling the performance of individual DAGs:The Gannt chart view: This is a great visualization for understanding the amount of time spent on individual tasks and the relative order of execution. If you’re worried about bottlenecks in your workflow, start here.Task duration: This allows you to profile the run characteristics of tasks within your DAG over a historical period. This tool is great at helping you understand temporal patterns in execution time and finding outliers in execution. Especially if you find that a DAG slows down over time, this view can help you understand whether it is a systemic issue and which tasks might need additional development.Landing times: This shows the delta between task completion and the start of the DAG run. This is an un-intuitive but powerful metric, as increases in it, when paired with stable task durations in upstream tasks, can help identify whether a scheduler is under heavy load and may need tuning.Additional metrics that have proven to be useful (but may need to be calculated) include the following:Task startup time: This is an especially useful metric when operating with a Kubernetes executor. To calculate this, you will need to calculate the difference between `start_date` and `execution_date` on each task instance. This metric will especially help you identify bottlenecks outside of Airflow that may impact task run times.Task failure and retry counts: Monitoring the frequency of task failures and retries can help identify information about the stability and robustness of your environment. Especially if these types of failure can be linked back to patterns in time or execution, it can help debug interactions with other services.DAG parsing time: Monitoring the amount of time a DAG takes to parse is very important to understand scheduler load and bottlenecks. If an individual DAG takes a long time to load (either due to heavy imports or long blocking calls being executed during parsing), it can have a material impact on the timeliness of scheduling tasks.ConclusionIn this article, we covered some essential strategies to effectively monitor both the core Airflow system and individual DAGs post-deployment. We highlighted the importance of active and suppressive monitoring techniques and provided insights into the critical metrics to track for each component, including the scheduler, metadata database, triggerer, executors/workers, and web server. Additionally, we discussed logging, alerting mechanisms, SLA monitoring, and performance profiling techniques to ensure the reliability, scalability, and efficiency of Airflow workflows. By implementing these monitoring practices and leveraging the insights gained, operators can proactively manage and optimize their Airflow deployments for optimal performance and reliability.Author BioDylan Intorf is a solutions architect and data engineer with a BS from Arizona State University in Computer Science. He has 10+ years of experience in the software and data engineering space, delivering custom tailored solutions to Tech, Financial, and Insurance industries.Kendrick van Doorn is an engineering and business leader with a background in software development, with over 10 years of developing tech and data strategies at Fortune 100 companies. In his spare time, he enjoys taking classes at different universities and is currently an MBA candidate at Columbia University.Dylan Storey has a B.Sc. and M.Sc. from California State University, Fresno in Biology and a Ph.D. from University of Tennessee, Knoxville in Life Sciences where he leveraged computational methods to study a variety of biological systems. He has over 15 years of experience in building, growing, and leading teams; solving problems in developing and operating data products at a variety of scales and industries.

2
0
62994

article-image-mastering-promql-a-comprehensive-guide-to-prometheus-query-language

Rob Chapman, Peter Holmes

07 Nov 2024

15 min read

Mastering PromQL: A Comprehensive Guide to Prometheus Query Language

Rob Chapman, Peter Holmes

07 Nov 2024

15 min read

This article is an excerpt from the book, "Observability with Grafana", by Rob Chapman, Peter Holmes. This book provides a holistic understanding of observability concepts using the Grafana Labs tools, teaching you how to fully leverage the LGTM stack.Introduction PromQL, or Prometheus Query Language, is a powerful tool designed to work with Prometheus, an open-source systems monitoring and alerting toolkit. Initially developed by SoundCloud in 2012 and later accepted by the Cloud Native Computing Foundation in 2016, Prometheus has become a crucial component of modern infrastructure monitoring. PromQL allows users to query data stored in Prometheus, enabling the creation of insightful dashboards and setting up alerts based on the performance metrics of applications and systems. This article will explore the core functionalities of PromQL, including how it interacts with metrics data and how it can be used to effectively monitor and analyze system performance. Introducing PromQL Prometheus was initially developed by SoundCloud in 2012; the project was accepted by the Cloud Native Computing Foundation in 2016 as the second incubated project (after Kubernetes), and version 1.0 was released shortly after. PromQL is an integral part of Prometheus, which is used to query stored data and produce dashboards and alerts. Before we delve into the details of the language, let’s briefly look at the following ways in which Prometheus-compatible systems interact with metrics data: Ingesting metrics: Prometheus-compatible systems accept a timestamp, key-value labels, and a sample value. As the details of the Prometheus Time Series Database (TSDB) are quite complicated, the following diagram shows a simplified example of how an individual sample for a metric is stored once it has been ingested: Figure 5.1 – A simplified view of metric data stored in the TSDB The labels or dimensions of a metric: Prometheus labels provide metadata to identify data of interest. These labels create metrics, time series, and samples: * Each unique __name__ value creates a metric. In the preceding figure, the metric is app_ frontend_requests. * Each unique set of labels creates a time series. In the preceding figure, the set of all labels is the time series. * A time series will contain multiple samples, each with a unique timestamp. The preceding figure shows a single sample, but over time, multiple samples will be collected for each time series. * The number of unique values for a metric label is referred to as the cardinality of the l abel. Highly cardinal labels should be avoided, as they signifi cantly increase the storage costs of the metric. The following diagram shows a single metric containing two time series and five samples: Figure 5.2 – An example of samples from multiple time series In Grafana, we can see a representation of the time series and samples from a metric. To do this, follow these steps: 1. In your Grafana instance, select Explore in the menu. 2. Choose your Prometheus data source, which will be labeled as grafanacloud-<team>prom (default). 3. In the Metric dropdown, choose app_frontend_requests_total, and under Options, set Format to Table, and then click on Run query. Th is will show you all the samples and time series in the metric over the selected time range. You should see data like this: Figure 5.3 – Visualizing the samples and time series that make up a metric Now that we understand the data structure, let’s explore PromQL. An overview of PromQL features In this section, we will take you through the features that PromQL has. We will start with an explanation of the data types, and then we will look at how to select data, how to work on multiple datasets, and how to use functions. As PromQL is a query language, it’s important to know how to manipulate data to produce alerts and dashboards. Data types PromQL offers three data types, which are important, as the functions and operators in PromQL will work diff erently depending on the data types presented: Instant vectors are a data type that stores a set of time series containing a single sample, all sharing the same timestamp – that is, it presents values at a specifi c instant in time: Figure 5.4 – An instant vector Range vectors store a set of time series, each containing a range of samples with different timestamps: Figure 5.5 – Range vectors Scalars are simple numeric values, with no labels or timestamps involved. Selecting data PromQL offers several tools for you to select data to show in a dashboard or a list, or just to understand a system’s state. Some of these are described in the following table: Table 5.1 – The selection operators available in PromQL In addition to the operators that allow us to select data, PromQL offers a selection of operators to compare multiple sets of data. Operators between two datasets Some data is easily provided by a single metric, while other useful information needs to be created from multiple metrics. The following operators allow you to combine datasets. Table 5.2 – The comparison operators available in PromQL Vector matching is an initially confusing topic; to clarify it, let’s consider examples for the three cases of vector matching – one-to-one, one-to-many/many-to-one, and many-to-many. By default, when combining vectors, all label names and values are matched. This means that for each element of the vector, the operator will try to find a single matching element from the second vector. Let’s consider a simple example: Vector A: 10{color=blue,smell=ocean} 31{color=red,smell=cinnamon} 27{color=green,smell=grass} Vector B: 19{color=blue,smell=ocean} 8{color=red,smell=cinnamon} 14{color=green,smell=jungle} A{} + B{}: 29{color=blue,smell=ocean} 39 {color=red,smell=cinnamon} A{} + on (color) B{} or A{} + ignoring (smell) B{}: 29{color=blue} 39{color=red} 41{color=green} When color=blue and smell=ocean, A{} + B{} gives 10 + 19 = 29, and when color=red and smell=cinnamon, A{} + B{} gives 31 + 8 = 29. The other elements do not match the two vectors so are ignored. When we sum the vectors using on (color), we will only match on the color label; so now, the two green elements match and are summed. This example works when there is a one-to-one relationship of labels between vector A and vector B. However, sometimes there may be a many-to-one or one-to-many relationship – that is, vector A or vector B may have more than one element that matches the other vector. In these cases, Prometheus will give an error, and grouping syntax must be used. Let’s look at another example to illustrate this: Vector A: 7{color=blue,smell=ocean} 5{color=red,smell=cinamon} 2{color=blue,smell=powder} Vector B: 20{color=blue,smell=ocean} 8{color=red,smell=cinamon} 14{color=green,smell=jungle} A{} + on (color) group_left B{}: 27{color=blue,smell=ocean} 13{color=red,smell=cinamon} 22{color=blue,smell=powder} Now, we have two different elements in vector A with color=blue. The group_left command will use the labels from vector A but only match on color. This leads to the third element of the combined vector having a value of 22, when the item matching in vector B has a different smell. The group_right operator will behave in the opposite direction. The final option is a many-to-many vector match. These matches use the logical operators and, unless, and or to combine parts of vectors A and B. Let’s see some examples: Vector A: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} 27{color=green,smell=grass} Vector B: 19{color=blue,smell=ocean} 8{color=red,smell=cinamon} 14{color=green,smell=jungle} A{} and B{}: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} A{} unless B{}: 27{color=green,smell=grass} A{} or B{}: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} 27{color=green,smell=grass} 14{color=green,smell=jungle} Unlike the previous examples, mathematical operators are not being used here, so the values of the elements are the values from vector A, but only the elements of A that match the logical condition in B are returned. ConclusionPromQL is an essential component of Prometheus, offering users a flexible and powerful means of querying and analyzing time-series data. By understanding its data types and operators, users can craft complex queries that provide deep insights into system performance. The language supports a variety of data selection and comparison operations, allowing for precise monitoring and alerting. Whether working with instant vectors, range vectors, or scalars, PromQL enables developers and operators to optimize their use of Prometheus for monitoring and alerting, ensuring systems remain performant and reliable. As organizations continue to embrace cloud-native architectures, mastering PromQL becomes increasingly vital for maintaining robust and efficient systems. Author BioRob Chapman is a creative IT engineer and founder at The Melt Cafe, with two decades of experience in the full application life cycle. Working over the years for companies such as the Environment Agency, BT Global Services, Microsoft, and Grafana, Rob has built a wealth of experience on large complex systems. More than anything, Rob loves saving energy, time, and money and has a track record for bringing production-related concerns forward so that they are addressed earlier in the development cycle, when they are cheaper and easier to solve. In his spare time, Rob is a Scout leader, and he enjoys hiking, climbing, and, most of all, spending time with his family and six children.Peter Holmes is a senior engineer with a deep interest in digital systems and how to use them to solve problems. With over 16 years of experience, he has worked in various roles in operations. Working at organizations such as Boots UK, Fujitsu Services, Anaplan, Thomson Reuters, and the NHS, he has experience in complex transformational projects, site reliability engineering, platform engineering, and leadership. Peter has a history of taking time to understand the customer and ensuring Day-2+ operations are as smooth and cost-effective as possible.

0
0
78790

article-image-mastering-the-api-life-cycle-a-comprehensive-guide-to-design-implementation-release-and-maintenance

Bruno Pedro

06 Nov 2024

15 min read

Mastering the API Life Cycle: A Comprehensive Guide to Design, Implementation, Release, and Maintenance

Bruno Pedro

06 Nov 2024

15 min read

2
0
87683

article-image-automating-ocr-and-translation-with-google-cloud-functions-a-step-by-step-guide

Agnieszka Koziorowska, Wojciech Marusiak

05 Nov 2024

15 min read

Automating OCR and Translation with Google Cloud Functions: A Step-by-Step Guide

Agnieszka Koziorowska, Wojciech Marusiak

05 Nov 2024

15 min read

This article is an excerpt from the book, "Google Cloud Associate Cloud Engineer Certification and Implementation Guide", by Agnieszka Koziorowska, Wojciech Marusiak. This book serves as a guide for students preparing for ACE certification, offering invaluable practical knowledge and hands-on experience in implementing various Google Cloud Platform services. By actively engaging with the content, you’ll gain the confidence and expertise needed to excel in your certification journey.Introduction In this article, we will walk you through an example of implementing Google Cloud Functions for optical character recognition (OCR) on Google Cloud Platform. This tutorial will demonstrate how to automate the process of extracting text from an image, translating the text, and storing the results using Cloud Functions, Pub/Sub, and Cloud Storage. By leveraging Google Cloud Vision and Translation APIs, we can create a workflow that efficiently handles image processing and text translation. The article provides detailed steps to set up and deploy Cloud Functions using Golang, covering everything from creating storage buckets to deploying and running your function to translate text. Google Cloud Functions Example Now that you’ve learned what Cloud Functions is, I’d like to show you how to implement a sample Cloud Function. We will guide you through optical character recognition (OCR) on Google Cloud Platform with Cloud Functions. Our use case is as follows: 1. An image with text is uploaded to Cloud Storage. 2. A triggered Cloud Function utilizes the Google Cloud Vision API to extract the text and identify the source language. 3. The text is queued for translation by publishing a message to a Pub/Sub topic. 4. A Cloud Function employs the Translation API to translate the text and stores the result in the translation queue. 5. Another Cloud Function saves the translated text from the translation queue to Cloud Storage. 6. The translated results are available in Cloud Storage as individual text files for each translation. We need to download the samples first; we will use Golang as the programming language. Source files can be downloaded from – https://github.com/GoogleCloudPlatform/golangsamples. Before working with the OCR function sample, we recommend enabling the Cloud Translation API and the Cloud Vision API. If they are not enabled, your function will throw errors, and the process will not be completed. Let’s start with deploying the function: 1. We need to create a Cloud Storage bucket. Create your own bucket with unique name – please refer to documentation on bucket naming under following link: https://cloud.google.com/storage/docs/buckets We will use the following code: gsutil mb gs://wojciech_image_ocr_bucket 2. We also need to create a second bucket to store the results: gsutil mb gs://wojciech_image_ocr_bucket_results 3. We must create a Pub/Sub topic to publish the finished translation results. We can do so with the following code: gcloud pubsub topics create YOUR_TOPIC_NAME. We used the following command to create it: gcloud pubsub topics create wojciech_translate_topic 4. Creating a second Pub/Sub topic to publish translation results is necessary. We can use the following code to do so: gcloud pubsub topics create wojciech_translate_topic_results 5. Next, we will clone the Google Cloud GitHub repository with some Python sample code: git clone https://github.com/GoogleCloudPlatform/golang-samples 6. From the repository, we need to go to the golang-samples/functions/ocr/app/ file to be able to deploy the desired Cloud Function. 7. We recommend reviewing the included go files to review the code and understand it in more detail. Please change the values of your storage buckets and Pub/Sub topic names. 8. We will deploy the first function to process images. We will use the following command: gcloud functions deploy ocr-extract-go --runtime go119 --trigger-bucket wojciech_image_ocr_bucket --entry-point ProcessImage --set-env-vars "^:^GCP_PROJECT=wmarusiak-book- 351718:TRANSLATE_TOPIC=wojciech_translate_topic:RESULT_ TOPIC=wojciech_translate_topic_results:TO_LANG=es,en,fr,ja" 9. After deploying the first Cloud Function, we must deploy the second one to translate the text. We can use the following code snippet: gcloud functions deploy ocr-translate-go --runtime go119 --trigger-topic wojciech_translate_topic --entry-point TranslateText --set-env-vars "GCP_PROJECT=wmarusiak-book- 351718,RESULT_TOPIC=wojciech_translate_topic_results" 10. The last part of the complete solution is a third Cloud Function that saves results to Cloud Storage. We will use the following snippet of code to do so: gcloud functions deploy ocr-save-go --runtime go119 --triggertopic wojciech_translate_topic_results --entry-point SaveResult --set-env-vars "GCP_PROJECT=wmarusiak-book-351718,RESULT_ BUCKET=wojciech_image_ocr_bucket_results" 11. We are now free to upload any image containing text. It will be processed first, then translated and saved into our Cloud Storage bucket. 12. We uploaded four sample images that we downloaded from the Internet that contain some text. We can see many entries in the ocr-extract-go Cloud Function’s logs. Some Cloud Function log entries show us the detected language in the image and the other extracted text: Figure 7.22 – Cloud Function logs from the ocr-extract-go function 13. ocr-translate-go translates detected text in the previous function: Figure 7.23 – Cloud Function logs from the ocr-translate-go function 14. Finally, ocr-save-go saves the translated text into the Cloud Storage bucket: Figure 7.24 – Cloud Function logs from the ocr-save-go function 15. If we go to the Cloud Storage bucket, we’ll see the saved translated files: Figure 7.25 – Translated images saved in the Cloud Storage bucket 16. We can view the content directly from the Cloud Storage bucket by clicking Download next to the file, as shown in the following screenshot: Figure 7.26 – Translated text from Polish to English stored in the Cloud Storage bucket Cloud Functions is a powerful and fast way to code, deploy, and use advanced features. We encourage you to try out and deploy Cloud Functions to understand the process of using them better. At the time of writing, Google Cloud Free Tier offers a generous number of free resources we can use. Cloud Functions offers the following with its free tier: 2 million invocations per month (this includes both background and HTTP invocations) 400,000 GB-seconds, 200,000 GHz-seconds of compute time 5 GB network egress per month Google Cloud has comprehensive tutorials that you can try to deploy. Go to https://cloud.google.com/functions/docs/tutorials to follow one. Conclusion In conclusion, Google Cloud Functions offer a powerful and scalable solution for automating tasks like optical character recognition and translation. Through this example, we have demonstrated how to use Cloud Functions, Pub/Sub, and the Google Cloud Vision and Translation APIs to build an end-to-end OCR and translation pipeline. By following the provided steps and code snippets, you can easily replicate this process for your own use cases. Google Cloud's generous Free Tier resources make it accessible to get started with Cloud Functions. We encourage you to explore more by deploying your own Cloud Functions and leveraging the full potential of Google Cloud Platform for serverless computing. Author BioAgnieszka is an experienced Systems Engineer who has been in the IT industry for 15 years. She is dedicated to supporting enterprise customers in the EMEA region with their transition to the cloud and hybrid cloud infrastructure by designing and architecting solutions that meet both business and technical requirements. Agnieszka is highly skilled in AWS, Google Cloud, and VMware solutions and holds certifications as a specialist in all three platforms. She strongly believes in the importance of knowledge sharing and learning from others to keep up with the ever-changing IT industry.With over 16 years in the IT industry, Wojciech is a seasoned and innovative IT professional with a proven track record of success. Leveraging extensive work experience in large and complex enterprise environments, Wojciech brings valuable knowledge to help customers and businesses achieve their goals with precision, professionalism, and cost-effectiveness. Holding leading certifications from AWS, Alibaba Cloud, Google Cloud, VMware, and Microsoft, Wojciech is dedicated to continuous learning and sharing knowledge, staying abreast of the latest industry trends and developments.

0
0
70252

article-image-vertex-ai-workbench-your-complete-guide-to-scaling-machine-learning-with-google-cloud

Jasmeet Bhatia, Kartik Chaudhary

04 Nov 2024

15 min read

Vertex AI Workbench: Your Complete Guide to Scaling Machine Learning with Google Cloud

Jasmeet Bhatia, Kartik Chaudhary

04 Nov 2024

15 min read

1
1
67379

article-image-essential-sql-for-data-engineers

Kedeisha Bryan, Taamir Ransome

31 Oct 2024

10 min read

Essential SQL for Data Engineers

Kedeisha Bryan, Taamir Ransome

31 Oct 2024

10 min read

This article is an excerpt from the book, Cracking the Data Engineering Interview, by Kedeisha Bryan, Taamir Ransome. The book is a practical guide that’ll help you prepare to successfully break into the data engineering role. The chapters cover technical concepts as well as tips for resume, portfolio, and brand building to catch the employer's attention, while also focusing on case studies and real-world interview questions.Introduction In the world of data engineering, SQL is the unsung hero that empowers us to store, manipulate, transform, and migrate data easily. It is the language that enables data engineers to communicate with databases, extract valuable insights, and shape data to meet their needs. Regardless of the nature of the organization or the data infrastructure in use, a data engineer will invariably need to use SQL for creating, querying, updating, and managing databases. As such, proficiency in SQL can often the difference between a good data engineer and a great one. Whether you are new to SQL or looking to brush up your skills, this chapter will serve as a comprehensive guide. By the end of this chapter, you will have a solid understanding of SQL as a data engineer and be prepared to showcase your knowledge and skills in an interview setting. In this article, we will cover the following topics: Must-know foundational SQL concepts Must-know advanced SQL concepts Technical interview questions Must-know foundational SQL concepts In this section, we will delve into the foundational SQL concepts that form the building blocks of data engineering. Mastering these fundamental concepts is crucial for acing SQL-related interviews and effectively working with databases. Let’s explore the critical foundational SQL concepts every data engineer should be comfortable with, as follows: SQL syntax: SQL syntax is the set of rules governing how SQL statements should be written. As a data engineer, understanding SQL syntax is fundamental because you’ll be writing and reviewing SQL queries regularly. These queries enable you to extract, manipulate, and analyze data stored in relational databases. SQL order of operations: The order of operations dictates the sequence in which each of the following operators is executed in a query: FROM and JOIN WHERE GROUP BY HAVING SELECT DISTINCT ORDER BY LIMIT/OFFSET Data types: SQL supports a variety of data types, such as INT, VARCHAR, DATE, and so on. Understanding these types is crucial because they determine the kind of data that can be stored in a column, impacting storage considerations, query performance, and data integrity. As a data engineer, you might also need to convert data types or handle mismatches. SQL operators: SQL operators are used to perform operations on data. They include arithmetic operators (+, -, *, /), comparison operators (>, <, =, and so on), and logical operators (AND, OR, and NOT). Knowing these operators helps you construct complex queries to solve intricate data-related problems. Data Manipulation Language (DML), Data Definition Language (DDL), and Data Control Language (DCL) commands: DML commands such as SELECT, INSERT, UPDATE, and DELETE allow you to manipulate data stored in the database. DDL commands such as CREATE, ALTER, and DROP enable you to manage database schemas. DCL commands such as GRANT and REVOKE are used for managing permissions. As a data engineer, you will frequently use these commands to interact with databases. Basic queries: Writing queries to select, filter, sort, and join data is an essential skill for any data engineer. These operations form the basis of data extraction and manipulation. Aggregation functions: Functions such as COUNT, SUM, AVG, MAX, MIN, and GROUP BY are used to perform calculations on multiple rows of data. They are essential for generating reports and deriving statistical insights, which are critical aspects of a data engineer’s role. The following section will dive deeper into must-know advanced SQL concepts, exploring advanced techniques to elevate your SQL proficiency. Get ready to level up your SQL game and unlock new possibilities in data engineering! Must-know advanced SQL concepts This section will explore advanced SQL concepts that will elevate your data engineering skills to the next level. These concepts will empower you to tackle complex data analysis, perform advanced data transformations, and optimize your SQL queries. Let’s delve into must-know advanced SQL concepts, as follows: Window functions: These do a calculation on a group of rows that are related to the current row. They are needed for more complex analyses, such as figuring out running totals or moving averages, which are common tasks in data engineering. Subqueries: Queries nested within other queries. They provide a powerful way to perform complex data extraction, transformation, and analysis, often making your code more efficient and readable. Common Table Expressions (CTEs): CTEs can simplify complex queries and make your code more maintainable. They are also essential for recursive queries, which are sometimes necessary for problems involving hierarchical data. Stored procedures and triggers: Stored procedures help encapsulate frequently performed tasks, improving efficiency and maintainability. Triggers can automate certain operations, improving data integrity. Both are important tools in a data engineer’s toolkit. Indexes and optimization: Indexes speed up query performance by enabling the database to locate data more quickly. Understanding how and when to use indexes is key for a data engineer, as it affects the efficiency and speed of data retrieval. Views: Views simplify access to data by encapsulating complex queries. They can also enhance security by restricting access to certain columns. As a data engineer, you’ll create and manage views to facilitate data access and manipulation. By mastering these advanced SQL concepts, you will have the tools and knowledge to handle complex data scenarios, optimize your SQL queries, and derive meaningful insights from your datasets. The following section will prepare you for technical interview questions on SQL. We will equip you with example answers and strategies to excel in SQL-related interview discussions. Let’s further enhance your SQL expertise and be well prepared for the next phase of your data engineering journey. Technical interview questions This section will address technical interview questions specifically focused on SQL for data engineers. These questions will help you demonstrate your SQL proficiency and problem-solving abilities. Let’s explore a combination of primary and advanced SQL interview questions and the best methods to approach and answer them, as follows: Question 1: What is the difference between the WHERE and HAVING clauses? Answer: The WHERE clause filters data based on conditions applied to individual rows, while the HAVING clause filters data based on grouped results. Use WHERE for filtering before aggregating data and HAVING for filtering after aggregating data. Question 2: How do you eliminate duplicate records from a result set? Answer: Use the DISTINCT keyword in the SELECT statement to eliminate duplicate records and retrieve unique values from a column or combination of columns. Question 3: What are primary keys and foreign keys in SQL? Answer: A primary key uniquely identifies each record in a table and ensures data integrity. A foreign key establishes a link between two tables, referencing the primary key of another table to enforce referential integrity and maintain relationships. Question 4: How can you sort data in SQL? Answer: Use the ORDER BY clause in a SELECT statement to sort data based on one or more columns. The ASC (ascending) keyword sorts data in ascending order, while the DESC (descending) keyword sorts it in descending order. Question 5: Explain the difference between UNION and UNION ALL in SQL. Answer: UNION combines and removes duplicate records from the result set, while UNION ALL combines all records without eliminating duplicates. UNION ALL is faster than UNION because it does not involve the duplicate elimination process. Question 6: Can you explain what a self join is in SQL? Answer: A self join is a regular join where a table is joined to itself. This is often useful when the data is related within the same table. To perform a self join, we have to use table aliases to help SQL distinguish the left from the right table. Question 7: How do you optimize a slow-performing SQL query? Answer: Analyze the query execution plan, identify bottlenecks, and consider strategies such as creating appropriate indexes, rewriting the query, or using query optimization techniques such as JOIN order optimization or subquery optimization. Question 8: What are CTEs, and how do you use them? Answer: CTEs are temporarily named result sets that can be referenced within a query. They enhance query readability, simplify complex queries, and enable recursive queries. Use the WITH keyword to define CTEs in SQL. Question 9: Explain the ACID properties in the context of SQL databases. Answer: ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability. These are basic properties that make sure database operations are reliable and transactional. Atomicity makes sure that a transaction is handled as a single unit, whether it is fully done or not. Consistency makes sure that a transaction moves the database from one valid state to another. Isolation makes sure that transactions that are happening at the same time don’t mess with each other. Durability makes sure that once a transaction is committed, its changes are permanent and can survive system failures. Question 10: How can you handle NULL values in SQL? Answer: Use the IS NULL or IS NOT NULL operator to check for NULL values. Additionally, you can use the COALESCE function to replace NULL values with alternative non-null values. Question 11: What is the purpose of stored procedures and functions in SQL? Answer: Stored procedures and functions are reusable pieces of SQL code encapsulating a set of SQL statements. They promote code modularity, improve performance, enhance security, and simplify database maintenance. Question 12: Explain the difference between a clustered and a non-clustered index. Answer: The physical order of the data in a table is set by a clustered index. This means that a table can only have one clustered index. The data rows of a table are stored in the leaf nodes of a clustered index. A non-clustered index, on the other hand, doesn’t change the order of the data in the table. After sorting the pointers, it keeps a separate object in a table that points back to the original table rows. There can be more than one non-clustered index for a table. Prepare for these interview questions by understanding the underlying concepts, practicing SQL queries, and being able to explain your answers. ConclusionThis article explored the foundational and advanced principles of SQL that empower data engineers to store, manipulate, transform, and migrate data confidently. Understanding these concepts has unlocked the door to seamless data operations, optimized query performance, and insightful data analysis. SQL is the language that bridges the gap between raw data and valuable insights. With a solid grasp of SQL, you possess the skills to navigate databases, write powerful queries, and design efficient data models. Whether preparing for interviews or tackling real-world data engineering challenges, the knowledge you have gained in this chapter will propel you toward success. Remember to continue exploring and honing your SQL skills. Stay updated with emerging SQL technologies, best practices, and optimization techniques to stay at the forefront of the ever-evolving data engineering landscape. Embrace the power of SQL as a critical tool in your data engineering arsenal, and let it empower you to unlock the full potential of your data. Author BioKedeisha Bryan is a data professional with experience in data analytics, science, and engineering. She has prior experience combining both Six Sigma and analytics to provide data solutions that have impacted policy changes and leadership decisions. She is fluent in tools such as SQL, Python, and Tableau.She is the founder and leader at the Data in Motion Academy, providing personalized skill development, resources, and training at scale to aspiring data professionals across the globe. Her other works include another Packt book in the works and an SQL course for LinkedIn Learning.Taamir Ransome is a Data Scientist and Software Engineer. He has experience in building machine learning and artificial intelligence solutions for the US Army. He is also the founder of the Vet Dev Institute, where he currently provides cloud-based data solutions for clients. He holds a master's degree in Analytics from Western Governors University.

3
0
44542

article-image-mastering-prometheus-sharding-boost-scalability-with-efficient-data-management

William Hegedus

28 Oct 2024

15 min read

Mastering Prometheus Sharding: Boost Scalability with Efficient Data Management

William Hegedus

28 Oct 2024

15 min read

0
0
32636

article-image-how-to-land-music-placements-and-chart-on-billboard

Chris Noxx

28 Oct 2024

10 min read

How to Land Music Placements and Chart on Billboard

Chris Noxx

28 Oct 2024

10 min read

This article is an excerpt from the book, A Power User's Guide to FL Studio 21, by Chris Noxx. Get a chance to learn from an FL Studio Power User to take your music productions to the next level using time-tested and decade-mastered production techniques. This book will uncover techniques for creating music in FL Studio and best approaches to making your way to Billboard charts.Introduction This broad article captures the essence of “Chapter 8 - How to Get Records Placed So They Land on Billboard Charts” of “A Power User's Guide to FL Studio 21” book written by Chris Noxx, covering key areas such as placements, catalog building, rights, outreach, and types of deals, all while remaining true to the original content In the highly competitive music industry, getting records placed with major artists and landing on the Billboard charts is a dream for many producers. This chapter provides a comprehensive guide on how to achieve that dream by focusing on placements, catalog building, networking, and understanding the business deals that will help propel your career to new heights. The Importance of the Journey Before diving into the technicalities, it’s essential to recognize that the path to success is not straightforward. Embracing the journey, with all its ups and downs, is crucial for maintaining the dedication and perseverance needed to make it in the music industry. What Are Record Placements? Record placements are one of the most coveted opportunities in the music world. They involve getting your music featured on a major artist's album, single, or other releases. In addition to record placements, sync placements offer another avenue for revenue, with your music being used in television shows, commercials, films, or video games. Both types of placements provide significant exposure and the potential for substantial income. Building and Valuing Your Music Catalog Your catalog of music is a valuable asset. Each track you create holds the potential for future placements, and over time, your catalog can generate consistent revenue. It's essential to recognize that catalogs have become an alternative asset class. They can be bought, sold, or licensed, much like real estate or stocks, making them a crucial long-term investment for your career. The value of a catalog is determined by various factors, including past performance, future sync potential, and overall demand. Even older tracks can increase in value when they find the right placement. Understanding Rights and Income Streams In the music business, revenue streams are primarily derived from two sides of the copyright: the publishing side and the master side. Publishing side: This covers the composition, including melodies, chords, and lyrics. Master side: This pertains to the actual sound recording. Maximizing income from placements means retaining as much ownership as possible on both sides of the copyright. Having a solid understanding of these rights ensures you're protecting your work and maximizing your earnings. Taking Action: The Key to Success Opportunities rarely come to those who wait, which is why taking action is critical. Whether through networking, outreach, or consistent improvement of your music, positioning yourself in the right places at the right times is vital to your success. Building relationships with key players in the industry—artists, managers, A&Rs—is a fundamental step toward getting your music in front of the right people. Attend industry events, create meaningful connections, and ensure you're continuously improving your craft to stand out in a crowded field. Two Primary Approaches to Securing Placements There are two main strategies for landing placements: Direct action: Actively pursuing placements by reaching out to artists, managers, and A&Rs. Indirect action: Building your brand and reputation through content creation and networking, allowing opportunities to come to you over time. Both approaches are essential and should be used together to maximize your chances of success. Consistent effort in both areas will yield the best results. The Power of Cold Outreach Cold outreach is a powerful, albeit often underutilized, tool in the industry. By reaching out to artists, managers, or other key players, you can introduce your work and potentially land a placement. Personalizing your outreach and demonstrating the value you bring to their projects will increase your chances of getting a response. Building a Strong Lead List Having a well-targeted lead list is crucial for cold outreach. Your list should include relevant artists, A&Rs, managers, and industry professionals who are likely to benefit from your music. The more focused your list, the better your chances of success when conducting outreach. Types of Industry Deals Understanding the types of deals available in the industry is essential for protecting your interests and maximizing your earnings. Here are some common deals that producers may encounter: Co-publishing deals: Where you split ownership of the publishing rights with a publisher. Administration deals: You maintain ownership of your rights but pay a third party to manage and administer them. Traditional publishing deals: You assign your publishing rights to a company that manages your catalog in exchange for an upfront payment and royalties. Self-publishing: You retain full control of your rights but take on the responsibilities of managing and administering them. Production deals: Where you provide services to artists in exchange for a portion of the income generated by their music. Management deals: An agreement where a manager oversees your career and takes a percentage of your earnings. Label deals: Contracts with record labels to distribute and promote your music. Joint venture deals: Partnerships with labels or other companies to jointly promote and distribute your music. Distribution deals: Agreements to distribute your music through a specific platform or company. Each type of deal offers different benefits and trade-offs, and understanding which one best suits your goals will help you navigate the business side of the music industry. Steps for Building a Successful Career Success in the music industry doesn’t happen overnight. It requires dedication, persistence, and the willingness to take deliberate action. By following these steps—building a strong catalog, mastering the business aspects of music, and positioning yourself effectively—you can increase your chances of landing major placements and seeing your records rise on the Billboard charts. Conclusion Achieving success in the music industry, particularly landing records on the Billboard charts, requires more than just talent; it demands strategic planning, consistent action, and a deep understanding of the business side of music. From building a valuable catalog of songs to mastering the intricacies of publishing rights and making the most of both direct and indirect outreach, every step plays a vital role in your journey. By positioning yourself in the right places, embracing opportunities through cold outreach, and networking with key industry players, you increase your chances of getting your music placed with major artists and securing lucrative sync placements. Understanding the various types of deals, from co-publishing to label agreements, further empowers you to protect your work and maximize your earnings. At the heart of it all is the drive to continuously improve and take action. The music industry is competitive, but by combining creative mastery with smart business moves, you can create lasting success and potentially see your records climb the Billboard charts. Your journey is as much about persistence as it is about creativity—embrace both to unlock your full potential. Author BioChris Noxx is a FL Studio Power User and JUNO nominated (2020 Rap Recording of the Year) producer, composer, and arranger, who has charted on Billboard Charts over 12 times in the US and Canada, and has worked with some of the most iconic hip hop artists of all time using FL Studio (including Dr Dre, Chuck D (Public Enemy), KRS 1, RBX, The Outlawz, Nate Dogg, DJ Quik, Bone Thugs & Harmony, Kurupt, B Real (Cypress Hill), Tory Lanez, Classified, Crooked I, Faith Evans, Troy Ave, Ras Kass, Bishop Lamont, Seether, Talib Kweli, Xzibit, Waka Flocka Flame, Lloyd Banks & Young Buck (G-Unit).

0
0
12623

article-image-mastering-code-generation-exploring-the-llvm-backend

Kai Nacke, Amy Kwan

24 Oct 2024

10 min read

Mastering Code Generation: Exploring the LLVM Backend

Kai Nacke, Amy Kwan

24 Oct 2024

10 min read

This article is an excerpt from the book, Learn LLVM 17 - Second Edition, by Kai Nacke, Amy Kwan. Learn how to build your own compiler, from reading the source to emitting optimized machine code. This book guides you through the JIT compilation framework, extending LLVM in a variety of ways, and using the right tools for troubleshooting.Introduction Generating optimized machine code is a critical task in the compilation process, and the LLVM backend plays a pivotal role in this transformation. The backend translates the LLVM Intermediate Representation (IR), derived from the Abstract Syntax Tree (AST), into machine code that can be executed on target architectures. Understanding how to generate this IR effectively is essential for leveraging LLVM's capabilities. This article delves into the intricacies of generating LLVM IR using a simple expression language example. We'll explore the necessary steps, from declaring library functions to implementing a code generation visitor, ensuring a comprehensive understanding of the LLVM backend's functionality. Generating code with the LLVM backend The task of the backend is to create optimized machine code from the LLVM IR of a module. The IR is the interface to the backend and can be created using a C++ interface or in textual form. Again, the IR is generated from the AST. Textual representation of LLVM IR Before trying to generate the LLVM IR, it should be clear what we want to generate. For our example expression language, the high-level plan is as follows: 1. Ask the user for the value of each variable. 2. Calculate the value of the expression. 3. Print the result. To ask the user to provide a value for a variable and to print the result, two library functions are used: calc_read() and calc_write(). For the with a: 3*a expression, the generated IR is as follows: 1. The library functions must be declared, like in C. The syntax also resembles C. The type before the function name is the return type. The type names surrounded by parenthesis are the argument types. The declaration can appear anywhere in the file: declare i32 @calc_read(ptr) declare void @calc_write(i32) 2. The calc_read() function takes the variable name as a parameter. The following construct defines a constant, holding a and the null byte used as a string terminator in C: @a.str = private constant [2 x i8] c"a\00" 3. It follows the main() function. The parameter names are omitted because they are not used. Just as in C, the body of the function is enclosed in braces: define i32 @main(i32, ptr) { 4. Each basic block must have a label. Because this is the first basic block of the function, we name it entry: entry: 5. The calc_read() function is called to read the value for the a variable. The nested getelemenptr instruction performs an index calculation to compute the pointer to the first element of the string constant. The function result is assigned to the unnamed %2 variable. %2 = call i32 @calc_read(ptr @a.str) 6. Next, the variable is multiplied by 3: %3 = mul nsw i32 3, %2 7. The result is printed on the console via a call to the calc_write() function: call void @calc_write(i32 %3) 8. Last, the main() function returns 0 to indicate a successful execution: ret i32 0 } Each value in the LLVM IR is typed, with i32 denoting the 32-bit bit integer type and ptr denoting a pointer. Note: previous versions of LLVM used typed pointers. For example, a pointer to a byte was expressed as i8* in LLVM. Since L LVM 16, opaque pointers are the default. An opaque pointer is just a pointer to memory, without carrying any type information about it. The notation in LLVM IR is ptr.Previous versions of LLVM used typed pointers. For example, a pointer to a byte was expressed as i8* in LLVM. Since L LVM 16, opaque pointers are the default. An opaque pointer is just a pointer to memory, without carrying any type information about it. The notation in LLVM IR is ptr. Since it is now clear what the IR looks like, let’s generate it from the AST. Generating the IR from the AST The interface, provided in the CodeGen.h header file, is very small: #ifndef CODEGEN_H #define CODEGEN_H #include "AST.h" class CodeGen { public: void compile(AST *Tree); }; #endifBecause the AST contains the information, the basic idea is to use a visitor to walk the AST. The CodeGen.cpp file is implemented as follows: 1. The required includes are at the top of the file: #include "CodeGen.h" #include "llvm/ADT/StringMap.h" #include "llvm/IR/IRBuilder.h" #include "llvm/IR/LLVMContext.h" #include "llvm/Support/raw_ostream.h" 2. The namespace of the LLVM libraries is used for name lookups: using namespace llvm; 3. First, some private members are declared in the visitor. Each compilation unit is represented in LLVM by the Module class and the visitor has a pointer to the module called M. For easy IR generation, the Builder (of type IRBuilder<>) is used. LLVM has a class hierarchy to represent types in IR. You can look up the instances for basic types such as i32 from the LLVM context. These basic types are used very often. To avoid repeated lookups, we cache the needed type instances: VoidTy, Int32Ty, PtrTy, and Int32Zero. The V member is the current calculated value, which is updated through the tree traversal. And last, nameMap maps a variable name to the value returned from the calc_read() function: namespace { class ToIRVisitor : public ASTVisitor { Module *M; IRBuilder<> Builder; Type *VoidTy; Type *Int32Ty; PointerType *PtrTy; Constant *Int32Zero; Value *V; StringMap<Value *> nameMap;4. The constructor initializes all members: public: ToIRVisitor(Module *M) : M(M), Builder(M->getContext()) { VoidTy = Type::getVoidTy(M->getContext()); Int32Ty = Type::getInt32Ty(M->getContext()); PtrTy = PointerType::getUnqual(M->getContext()); Int32Zero = ConstantInt::get(Int32Ty, 0, true); }5. For each function, a FunctionType instance must be created. In C++ terminology, this is a function prototype. A function itself is defined with a Function instance. The run() method defines the main() function in the LLVM IR first: void run(AST *Tree) { FunctionType *MainFty = FunctionType::get( Int32Ty, {Int32Ty, PtrTy}, false); Function *MainFn = Function::Create( MainFty, GlobalValue::ExternalLinkage, "main", M); 6. Then we create the BB basic block with the entry label, and attach it to the IR builder: BasicBlock *BB = BasicBlock::Create(M->getContext(), "entry", MainFn); Builder.SetInsertPoint(BB);7. With this preparation done, the tree traversal can begin: Tree->accept(*this); 8. After the tree traversal, the computed value is printed via a call to the calc_write() function. Again, a function prototype (an instance of FunctionType) has to be created. The only parameter is the current value, V: FunctionType *CalcWriteFnTy = FunctionType::get(VoidTy, {Int32Ty}, false); Function *CalcWriteFn = Function::Create( CalcWriteFnTy, GlobalValue::ExternalLinkage, "calc_write", M); Builder.CreateCall(CalcWriteFnTy, CalcWriteFn, {V});9. The generation finishes by returning 0 from the main() function: Builder.CreateRet(Int32Zero); }10. A WithDecl node holds the names of the declared variables. First, we create a function prototype for the calc_read() function: virtual void visit(WithDecl &Node) override { FunctionType *ReadFty = FunctionType::get(Int32Ty, {PtrTy}, false); Function *ReadFn = Function::Create( ReadFty, GlobalValue::ExternalLinkage, "calc_read", M);11. The method loops through the variable names: for (auto I = Node.begin(), E = Node.end(); I != E; ++I) { 12. For each variable, a string with a variable name is created: StringRef Var = *I; Constant *StrText = ConstantDataArray::getString( M->getContext(), Var); GlobalVariable *Str = new GlobalVariable( *M, StrText->getType(), /*isConstant=*/true, GlobalValue::PrivateLinkage, StrText, Twine(Var).concat(".str"));13. Then the IR code to call the calc_read() function is created. The string created in the previous step is passed as a parameter: CallInst *Call = Builder.CreateCall(ReadFty, ReadFn, {Str});14. The returned value is stored in the mapNames map for later use: nameMap[Var] = Call; }15. The tree traversal continues with the expression: Node.getExpr()->accept(*this); };16. A Factor node is either a variable name or a number. For a variable name, the value is looked up in the mapNames map. For a number, the value is converted to an integer and turned into a constant value: virtual void visit(Factor &Node) override { if (Node.getKind() == Factor::Ident) { V = nameMap[Node.getVal()]; } else { int intval; Node.getVal().getAsInteger(10, intval); V = ConstantInt::get(Int32Ty, intval, true); } };17. And last, for a BinaryOp node, the right calculation operation must be used: virtual void visit(BinaryOp &Node) override { Node.getLeft()->accept(*this); Value *Left = V; Node.getRight()->accept(*this); Value *Right = V; switch (Node.getOperator()) { case BinaryOp::Plus: V = Builder.CreateNSWAdd(Left, Right); break; case BinaryOp::Minus: V = Builder.CreateNSWSub(Left, Right); break; case BinaryOp::Mul: V = Builder.CreateNSWMul(Left, Right); break; case BinaryOp::Div: V = Builder.CreateSDiv(Left, Right); break; } }; }; }18. With this, the visitor class is complete. The compile() method creates the global context and the module, runs the tree traversal, and dumps the generated IR to the console: void CodeGen::compile(AST *Tree) { LLVMContext Ctx; Module *M = new Module("calc.expr", Ctx); ToIRVisitor ToIR(M); ToIR.run(Tree); M->print(outs(), nullptr); }We now have implemented the frontend of the compiler, from reading the source up to generating the IR. Of course, all these components must work together on user input, which is the task of the compiler driver. We also need to implement the functions needed at runtime. Both are topics of the next section covered in the book. Conclusion In conclusion, the process of generating LLVM IR from an AST involves multiple steps, each crucial for producing efficient machine code. This article highlighted the structure and components necessary for this task, including function declarations, basic block management, and tree traversal using a visitor pattern. By carefully managing these elements, developers can harness the power of LLVM to create optimized and reliable machine code. The integration of all these components, alongside user input and runtime functions, completes the frontend implementation of the compiler. This sets the stage for the next phase, focusing on the compiler driver and runtime functions, ensuring seamless execution and integration of the compiled code. Author BioKai Nacke is a professional IT architect currently residing in Toronto, Canada. He holds a diploma in computer science from the Technical University of Dortmund, Germany. and his diploma thesis on universal hash functions was recognized as the best of the semester. With over 20 years of experience in the IT industry, Kai has extensive expertise in the development and architecture of business and enterprise applications. In his current role, he evolves an LLVM/clang-based compiler.nFor several years, Kai served as the maintainer of LDC, the LLVM-based D compiler. He is the author of D Web Development and Learn LLVM 12, both published by Packt. In the past, he was a speaker in the LLVM developer room at the Free and Open Source Software Developers’ European Meeting (FOSDEM).Amy Kwan is a compiler developer currently residing in Toronto, Canada. Originally, from the Canadian prairies, Amy holds a Bachelor of Science in Computer Science from the University of Saskatchewan. In her current role, she leverages LLVM technology as a backend compiler developer. Previously, Amy has been a speaker at the LLVM Developer Conference in 2022 alongside Kai Nacke.

0
0
47753

article-image-connecting-cloud-object-storage-with-databricks-unity-catalog

Pulkit Chadha

22 Oct 2024

10 min read

Connecting Cloud Object Storage with Databricks Unity Catalog

Pulkit Chadha

22 Oct 2024

10 min read

0
0
47916

Optimizing Graphics Pipelines with Meshlets: A Guide to Efficient Geometry Processing

Mastering Performance Tuning with DAX Studio and VertiPaq Analyzer

Enhancing Observability with Azure Native ISV Services and Third-Party Integrations

Managing AI Security Risks with Zero Trust: A Strategic Guide

Mastering Transfer Learning: Fine-Tuning BERT and Vision Transformers

Airflow Ops Best Practices: Observation and Monitoring

Mastering PromQL: A Comprehensive Guide to Prometheus Query Language

Mastering the API Life Cycle: A Comprehensive Guide to Design, Implementation, Release, and Maintenance

Automating OCR and Translation with Google Cloud Functions: A Step-by-Step Guide

Vertex AI Workbench: Your Complete Guide to Scaling Machine Learning with Google Cloud

Trending Topics

Essential SQL for Data Engineers

Mastering Prometheus Sharding: Boost Scalability with Efficient Data Management

How to Land Music Placements and Chart on Billboard

Mastering Code Generation: Exploring the LLVM Backend

Connecting Cloud Object Storage with Databricks Unity Catalog

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access