Qlik Data Catalyst: Enriching the Catalog

Hey guys, what’s going on? Brian and George here with SME, and today it’s time for another exciting adventure in the Data Catalyst series. George, I know you’ve been, you know, crushing it with the series. This is technically video three, but the second of the demo part of the series. So what do you got for us today?
So in the last video, we brought our data into Data Catalyst. I think that now it’s time to start enriching the catalog. When we enrich the catalog, what we’re talking about is adding the business definitions and the technical metadata, and checking out the lineage between different entities within Data Catalyst itself. So this is really where the whole process of cataloging the data starts. So let’s go ahead and take a look at the demo.
Sounds great, let’s do it.
Before we start editing our entities, let’s take a quick look at the catalog itself. Here are all of the different entities that I’ve loaded into Data Catalyst. Notice that each of them has three big metrics associated with it. Operational is a composite of the relative number of finished loads and how long ago the data was last loaded. To get a higher percentage here, you need newer and more regularly updated data. Quality is tied to the relative amount of bad and ugly records. Obviously, the higher this metric is, the more ready the data is for production. And Popularity is a composite of the number of tags, comments, and workflows involved and the number of published jobs the entity is associated with. While all of these are out of the box, you can configure them if need be. There are also some base metadata statistics, like entity level, row and field counts, and last load, that you can see before viewing the full entity.
So let’s do a quick exercise. I want to find the sales table that I onboarded the other day. Well, that’s not helpful. There are over a dozen different entities that hit that search. So let’s go back and see how we can use metadata to more easily access our data.
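To make the Quality metric a little more concrete, here is a minimal Python sketch. It assumes each load reports counts of good, bad and ugly records, and that the score is simply the share of good records. The `LoadCounts` class and the `quality_score` function are hypothetical names for illustration only; the actual out-of-the-box formula in Data Catalyst is not given in this video and may well differ.

```python
from dataclasses import dataclass


@dataclass
class LoadCounts:
    """Hypothetical record counts for one entity load (an assumption, not a product API)."""
    good: int
    bad: int
    ugly: int


def quality_score(counts: LoadCounts) -> float:
    """Return the percentage of good records, assuming quality = good / total.

    This is an illustrative formula only; the real metric is configurable
    and its exact definition is not stated in the transcript.
    """
    total = counts.good + counts.bad + counts.ugly
    if total == 0:
        # No records loaded yet, so there is nothing to score.
        return 0.0
    return 100.0 * counts.good / total


if __name__ == "__main__":
    # Fewer bad and ugly records pushes the score toward 100.
    print(quality_score(LoadCounts(good=90, bad=5, ugly=5)))
```

The point of the sketch is only the direction of the relationship: as the relative amount of bad and ugly records goes down, the score goes up.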
In the detail section, I can add information related to my specific entity. I can add a business name like SME sales, as well as a business description. I can also add tags here, such as George SME, so I can more easily find my data.
Now before we go back and look at the catalog again, let’s go look at a specific property. All of these properties can be changed to make sure the data is of the highest quality and abides by all the company’s standards. You can add your own properties as well. But I want to focus on one. So this is a post-processing rule. I’ve broken it out into a text editor for better viewing. These are regex rules that detect patterns like credit card and social security numbers. What you can do here is either prevent them from going to the good partition, or you can flag and obfuscate them so that they get taken care of. This is just one example of the power of Data Catalyst.
All right, it’s time to go back and find my entity. Notice that I have the ability to search for specific attributes like tags, business name and business description. I’m going to go ahead and search for my name and easily find my data set. Okay, so let’s go check it out. So here, the end user can get a full picture of the data set. Up here, I can get a sample data set to validate what I’m looking for. And down here, I get a full list of my related entities. You can probably guess that some of these come from the same source and belong to the same schema. You also get the metadata here as well, in addition to the relationship, whether they’re a child or a parent.
So one last thing I want to show is lineage. Let’s go take a look at the global transactions entity. This entity lineage shows the entities and processes related to the global transactions entity. We can see the source, the technical metadata load, the data load, as well as the data workflows that prepare the entity for specific targets.
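For anyone curious what the regex-based post-processing rule mentioned earlier might look like in spirit, here is a small Python sketch. It is not the rule shown in the video and not Data Catalyst syntax. The two patterns are simplified assumptions (US-style social security numbers and 16-digit card numbers), and `contains_sensitive` and `obfuscate` are hypothetical helper names.

```python
import re

# Simplified, assumed patterns; real rules are usually stricter.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")


def contains_sensitive(value: str) -> bool:
    """Flag a value if it looks like it holds an SSN or a card number."""
    return bool(SSN_PATTERN.search(value) or CARD_PATTERN.search(value))


def _mask(match: re.Match) -> str:
    """Replace every digit except the last four with 'X', keeping separators."""
    text = match.group(0)
    total_digits = sum(ch.isdigit() for ch in text)
    seen = 0
    masked = []
    for ch in text:
        if ch.isdigit():
            seen += 1
            masked.append(ch if seen > total_digits - 4 else "X")
        else:
            masked.append(ch)
    return "".join(masked)


def obfuscate(value: str) -> str:
    """Mask any card numbers and SSNs found in the value."""
    value = CARD_PATTERN.sub(_mask, value)
    return SSN_PATTERN.sub(_mask, value)


if __name__ == "__main__":
    print(contains_sensitive("SSN 123-45-6789"))
    print(obfuscate("SSN 123-45-6789"))
    print(obfuscate("card 4111 1111 1111 1111"))
```

The two functions mirror the two options described in the demo: `contains_sensitive` is the kind of check that could keep a record out of the good partition, while `obfuscate` flags nothing and instead masks the value so it can pass through safely.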
Lineage is incredibly useful in tracking the life cycle of the data. So that’s it for today’s video, guys. I hope you enjoyed it. In the next video, we will dive into how to prepare data flows to transform entities and get them production ready. If you want to learn more about Data Catalyst or Qlik’s data integration platform, leave us a comment below or reach out to us at [email protected]. Until next time, bye, guys.
