SFMC Asset Relationship Finder: Discover Where Your Assets Are Used
View from Data Extension search

SFMC Asset Relationship Finder: Discover Where Your Assets Are Used

In my 8 years of experience with Salesforce Marketing Cloud, I have always envisioned a solution that could help users understand the relationships between assets—essentially, which asset is linked to which, and how.

Questions like:

  • "What happens if I delete this Data Extension?"
  • "This query seems to be in use, but in which automation?"
  • "Is this unsubscribe page associated with any emails?"
  • "Which CloudPages are using this CloudPage to send data?"

And many others often arise. Unfortunately, there hasn't been an easy way to answer these. You either have to write custom SSJS code for each use case, enduring some painful moments if you manage many assets, or manually sift through APIs or the UI to find the answers.

In the end, I set out to build a solution that brings everything together in one place.

See the code

Key Features

The SFMC Asset Relationship Finder is designed to address these exact challenges by providing an efficient way to explore and understand the relationships between various Salesforce Marketing Cloud assets. Here are some of its standout features:

  • Asset Relationship Discovery: Quickly identifies the relationships between different SFMC assets, including Data Extensions, Automations, Emails, and Cloud Pages.
  • Concurrent API Calls: Ensures rapid, simultaneous API interactions, making the application faster and more responsive even when managing multiple asset relationships in real-time.
  • Multi-Asset Type Support: Supports searching and relationship mapping for multiple asset types (Data Extensions, Cloud Pages, Emails, Automations).
  • Smart Asset Filtering: Filter assets by name or key, and view detailed relationships to other assets within SFMC.
  • Efficient Search & Display: Presents asset information in a clear and concise format, including which automations use specific Data Extensions or which Emails reference Cloud Pages and etc.
  • Intelligent Caching: Utilizes a "Go Back" feature to rapidly create multiple Data Extensions without data loss.
  • Enhanced User Interaction: Includes a “View More” feature for long lists of relationships, allowing users to expand or collapse results as needed without overwhelming the dashboard.


View from Scripts search

Important Considerations

  • API Intensity: When dealing with large assets like Journeys and Emails, expect higher API consumption due to the extensive search operations. This can impact your API quota. Details will be shared later in this article.
  • Loading Behavior: If your results have numerous items and you click "View More," be prepared—the app will load all items in one go. If you’re working with large datasets, this may not be the behavior you want. Consider alternative approaches if performance becomes a concern.
  • SOAP API Limitation: Keep in mind that pagination was not included for SOAP APIs that allow 2500 items in each call, meaning you’ll need to work within these constraints when managing large datasets.
  • Custom Regex & Filters: The app relies heavily on complex regular expressions and tailored filtering. These have been fine-tuned for common use cases, but can easily be adjusted to suit your specific needs, which we’ll explore in more detail throughout the article.
  • Testing Limitations: Due to the sheer variety of SFMC setups, it wasn’t feasible to test every possible use case. If you encounter scenarios where the logic needs adjusting, feel free to reach out—I’m open to suggestions for improvement.

Technical Dive-in

Why Go (Golang)?

Before settling on a language, I did some research and had discussions with ChatGPT. Initially, I had no experience with Golang, but it quickly stood out. The more I learned about it, the more I appreciated how lightweight and efficient it is.

In this application, we make multiple API calls to Salesforce Marketing Cloud, and Go's goroutines allow us to handle these calls concurrently, optimizing performance without adding unnecessary complexity. This is particularly important when dealing with large datasets or filtering through multiple assets. Go efficiently manages concurrency, reducing wait times and speeding up processing.

To give a more detailed example: let’s say you want to check if a Data Extension is used by both Script Activities and CloudPages. Assume you now have 400 CloudPages and 350 scripts. Given that SFMC’s maximum page size is 50, this would mean:

  • 8 pages for CloudPages (400 ÷ 50).
  • 7 pages for scripts (350 ÷ 50).

Let’s assume it takes 0.7 seconds to fetch one page of CloudPages and 0.5 seconds for scripts. If you process these sequentially, you'd first wait for all the CloudPages and then for the scripts. Here's how the total time adds up:

  • 8×0.78 \times 0.78×0.7 seconds = 5.6 seconds for CloudPages.
  • 7×0.57 \times 0.57×0.5 seconds = 3.5 seconds for scripts.

In total, the non-concurrent processing time would be 5.6 + 3.5 = 9.1 seconds.

However, with Go’s concurrency, both requests would be processed simultaneously, meaning you only wait for the longer duration—in this case, CloudPages. So the total wait time becomes just 5.6 seconds, rather than 9.1 seconds.

Now, you can think what about checking for other assets and how much time would be needed for all operations when it is done with non-concurrent processing.

Go's strong type system is also a significant advantage. It helps avoid common pitfalls when parsing and handling complex responses from SFMC’s APIs, such as asset relationships across data extensions, automations, emails, and cloud pages. The code benefits from Go's simplicity and strictness in managing these relationships, providing cleaner, more maintainable code compared to dynamic languages like Python or Node.js. This strictness helps in avoiding runtime errors, making the development process more predictable and manageable.

Moreover, Go's performance shines when dealing with pagination and SOAP API requests in Salesforce Marketing Cloud, which require handling large data sets efficiently. Python or Node.js would likely require more resources to manage these at scale, whereas Go handles these tasks in a lightweight and performant manner.


Another view for Data Extension Search with multiple options

Data Extensions

At the core of the app are Data Extensions, as they can associate directly or indirectly with almost all other assets. This makes understanding these relationships crucial, which is why there are 10 different options for discovering Data Extension connections.

Users can submit a request by entering either the Data Extension Name or CustomerKey. Upon submission, the app first verifies the existence of the Data Extension. This search happens in two stages: First, it checks regular/non-shared Data Extensions. If no match is found, the search extends to shared Data Extensions. If the Data Extension is not found in either case, an error message is returned: "This data extension is not found."

When the data extension is found, we will not just use the name of it, its ObjectID and customerKey will be also valuable in some cases.

There is also a small cache system to optimize performance and reduce redundant API calls. This cache stores results for 10 minutes based on the Data Extension's ObjectID. For example, if a user searches for a Data Extension using four different options, the results are cached for 10 minutes. If the user wants to search for something else related to that Data Extension within that time frame, the app retrieves the results from the cache instead of making new API calls, ensuring faster response times and lower API usage.

  • Name (optional)

This is not selectable. When user enters CustomerKey, they see the Name of the Data Extension as well thinking that they might not know the name.

  • Path of this Data Extension

This one is quite straightforward because there are lots of examples online to find the path of data extensions. It's important to note that adding QueryAllAccounts is essential when the data extension is shared, as indicated by the ContentType being shared_dataextension in XML request envelope. Without this, folders for shared data extensions cannot be retrieved."

This process involves several SOAP calls to retrieve DataFolders, depending on how deep the data extension is nested.

More details can be found here (Thanks Ivan Razine ):

https://ampscript.xyz/how-tos/how-to-retrieve-all-shared-data-extensions-using-ssjs/

  • Queries targeting this Data Extension

Another straightforward one, done without putting so much effort in it. A regular SOAP call for QueryDefinition object to filter DataExtensionTarget.Name with the name of your data extension is enough to get the right results.

  • Queries including this Data Extension

Initially, I was concerned that retrieving all queries and manually searching through the content of each one for references would be necessary, but it turned out to be much simpler. Checking QueryText with like operator makes it work. I tested with Data Extension names that contain spaces or shared Data Extensions—both of which are treated differently in SFMC queries—and it worked in all cases.

  • Import Activities targeting this Data Extension

For Import Activities, ImportDefinition must be retrieved by filtering DestinationObject.ObjectID with Data Extension ObjectID. However, the results often include many system-generated Import Activities, which need to be carefully filtered out. To address this, I created a regex to exclude Import Activities whose names are just hashes. If this doesn't cover all cases, the regex can be easily adjusted.

  • Filter Activities targeting this Data Extension

For Filter Activities, FilterActivity must be retrieved by filtering DestinationObjectID with Data Extension ObjectID. Similar to Import Activities, there are many system-generated Filter Activities that must be filtered out. To handle this, I designed a regex to exclude Filter Activities with names that either have hashes in them or start with "Activity for result group". If necessary, this regex can also be updated to refine the filtering criteria.

Here, I also added DestinationTypeID = 2 filter, meaning only Filter Activities targeting data extensions will be focused while querying via SOAP API.

Error handling for Data Extensions

  • Content Builder Emails using this Data Extension

One of the most enlightening challenges in developing this app was working with the /asset/v1/content/assets/query endpoint to search for emails or CloudPages.

Querying assets is relatively simple, but do you always get the exact results you're after? Unfortunately, it's not always that straightforward. Here's an example:

Let’s say you have a Lookup function within an AMPscript block that queries a Data Extension named "Abc123". A typical query would look like this (without mentioning the other parts like filtering for right assetType.name and so on):

            "property": "content",
            "simpleOperator": "mustcontain", or "contains" 
            "value": "Abc123"        

Did it return the email you're looking for? If it did, you’re in luck! However, after numerous attempts, I found that this often doesn't work as expected. The content search simply wouldn’t return the email.

Through trial and error, I discovered that the following format yields the expected results:

            "property": "content",
            "simpleOperator": "mustcontain",
            "value": "%\"Abc123\"%"        

It turns out SFMC handles content with AMPscript or SSJS blocks differently, especially when escape characters like / are involved. Unfortunately, even with this workaround, it’s not foolproof and doesn't cover every scenario. I’ll delve deeper into the specifics of this issue when we discuss CloudPages.

I used "templatebasedemail" and "htmlemail" as assetType with OR operator.

  • User-Initiated Emails using this Data Extension

So far, all the results have been filtered in real-time, except for cases where we exclude system-generated results. However, for User-Initiated Emails, real-time filtering isn’t feasible because SendDefinitionList.CustomObjectID is not a filterable field. This means that instead of applying a filter upfront, we need to retrieve all User-Initiated Emails (via EmailSendDefinition) without any filters.

Once the full list of User-Initiated Emails is retrieved, we can then manually check each entry’s SendDefinitionList.CustomObjectID to see if it matches the Data Extension’s ObjectID.

  • Journeys using this Data Extension as Entry Source

The biggest challenge for the "Data Extension" part of the app was retrieving Journeys and linking them to Data Extensions through Event Definitions. The complexity didn’t lie in the technical difficulty but in finding the most efficient way to handle this with concurrent API calls. Why? Because another object, Event Definitions, store the Data Extension name, but you can’t filter Event Definitions by Data Extension name directly. This left us with two possible approaches:

  1. Search All Event Definitions First: We could retrieve all Event Definitions, search for those related to the Data Extension, and then call the Journeys API using the Event Definitions by filtering "Journey Name equals Event Definition Name."
  2. Search Journeys First: Alternatively, we could first retrieve all Journeys, extract the Event Definition key (which is available in most cases), and then search for the related Event Definitions—first by key and then by name as a fallback. When searching by name, we also need to order the results by CreatedDate to ensure we retrieve the latest Event Definition since a Journey might have multiple versions using different Event Definitions having same name which Journey Name.

I chose option 2 for a few key reasons:

  • The number of Event Definitions often exceeds the number of Journeys. Event Definitions persist even when Journeys are deleted, and changing the entry source of a Journey in different versions creates additional Event Definitions.
  • Event Definitions can’t be filtered by their active status, but Journeys can. If you only want to search active Journeys, filtering becomes more efficient when you start with Journeys.
  • Journeys can’t be filtered by Event Definition Key, but Event Definitions can be filtered by the keys extracted from Journeys. This allows for more targeted searches and faster results.

In the end, option 2 provided a more streamlined, efficient approach, especially when handling large datasets and numerous Event Definitions.

After this, we had to tackle the issue of concurrency. SFMC allows only 100 concurrent REST API calls at a time, so I implemented a strategy to prevent hitting this limit. After retrieving all the Journeys with their Event Definition keys and names, I batch the Event Definitions in groups of 50 and introduce a 0.5-second delay between batches. This ensures we stay within SFMC's API rate limits while still making concurrent API calls to retrieve the Data Extension name from each Event Definition.

This approach helps maintain efficiency, especially when additional API calls are required based on the user’s selection, such as when they include Scripts or CloudPages in their search, or when other calls are made for different apps/solutions during this time.

Another challenge involved sanitizing URLs while searching for Event Definition names. Simple URL sanitization wasn’t enough for certain cases. For example, names like "This_is_a_name_for_the_journey[one-two-three]" could return zero results even after sanitization. To handle this, I added a custom function that truncates the name before any special characters like "[". So, the name "This_is_a_name_for_the_journey[" could bring the result for the search. This logic might need adjustment depending on specific naming conventions, especially for Journey names that start with special characters like "[".


Example view for CloudPages

  • Scripts Activities including this Data Extension

Script Activities, unfortunately, don't offer the same flexibility as Queries when it comes to searching through their content. The only field that can be filtered is the name. As a result, retrieving all Script Activities concurrently is necessary, and then manually searching for the data extension name within their content.

For this search, I opted not to narrow it down by specifically targeting AMPscript functions, as data extension names can also be assigned to variables, making it harder to pinpoint exact references. However, if you want to refine the search further—for example, limiting it to instances where data extension names are explicitly used in AMPscript or other functions—you can modify the code to meet your specific needs.

  • CloudPages including this Data Extension

Interestingly and unfortunately, I couldn’t make the queries I used for email searching work for CloudPages. As a result, I had to adopt a similar approach to what I did for Scripts—retrieving all CloudPages concurrently and going through the content manually. The same considerations I mentioned for Scripts apply here as well; the logic can be adjusted to fit specific needs if required. I only used "webpage" as assetType.

Here, I was sneakily trying to explore if there was a way to refine the search for Cloudpages and thanks a lot to Adam Spriggs who tried to help:

https://salesforce.stackexchange.com/questions/425509/searching-through-the-content-of-a-cloudpage

Automation Activities (Scripts, Filters, Imports, Queries)

For Automation Activities, the main goal is to determine if a given activity is part of any automation. Since it's relatively easy to find the path of an Automation Activity, I chose not to focus on that aspect here.

When the user inputs the name of an activity, the app searches for it in the relevant object (Scripts, FilterActivity, ImportDefinition, QueryDefinition) to retrieve its ObjectID/ssjsActivityId. If the activity is found, we then check the Activity object through the SOAP API to determine if it's being used in any automation. The Activity object essentially serves as a "pointer" to Automation Activities when they are part of an Automation, acting like a mirror of the original activity.

If a result is found, we extract the Program.ObjectID associated with that activity and then search the Program object for a matching ObjectID. In the context of Salesforce Marketing Cloud, the Program object is essentially the API name for Automations. So, by searching the Program object, we are actually looking for Automations. If a matching automation is found, we can confirm that the activity is in use. If no automation is found, we return an error message like "No automation found for this activity."

CloudPages

Another interesting challenge was searching for CloudPages, as it’s not possible to retrieve the PageID that we use in the AMPscript CloudPagesURL function—whether it's in Emails or other CloudPages—via the API.

Therefore, when a user inputs a CloudPage ID (which we assume to be the PageID), there is no reliable way to validate whether a CloudPage with that specific ID actually exists. This led to the approach of simply searching the content of both Emails and CloudPages for the user-provided ID. Essentially, this method involves retrieving the content from both asset types and performing a straightforward content search, rather than using more sophisticated lookup techniques. Hence, you don't have to only search CloudPage ID here, you can search all other things you want to search in Emails or CloudPages if it is necessary.

Emails

For Emails, we accept both Name and ID as inputs. However, the ID we accept is not the asset ID but the Email ID (like the PageID for CloudPages) used to match emails with other objects. Similar to how we handle Data Extensions, if the user provides the Email ID without the Name, we return the corresponding Name of the Email along with the results.

After the user inputs an Email Name or ID, we first search for the email to confirm its existence. Once a match is found, we proceed based on the options the user has selected.

  • For Journeys including the Email, we skip dealing with Event Definitions and simply append &extras=activities to our API call. This allows us to search the response to find which Journeys include that Email as part of their items.
  • For User-Initiated Emails using the Email, as explained earlier, we retrieve all User-Initiated Emails and compare the Email.ID field to check for a match with the ID provided by the user.
  • For Triggered Sends using the Email, we query the TriggeredSendDefinition object, filtering by Email.ID to match the Email ID the user is querying. This makes it possible to filter Triggered Sends in the request by Email ID.

If you wonder why we did not include a search for Triggered Sends related to Data Extensions, the reason is Data Extension information cannot be retrieved in the context of Triggered Sends, unlike with Emails.

API Usage

As mentioned earlier, this app can consume a significant number of API calls, depending on the options selected and the frequency of use since all data is retrieved in real-time. Let’s break down these calls with an approximate calculation:

  • Session Initiation: Every session begins with two API calls—one for authentication and one for retrieving user information (getUserInfo), which is essential for identifying the enterprise ID when working with shared data extensions.

Data Extensions

  • Retrieving the Data Extension: One call for regular data extensions. Two calls for shared data extensions, as the first call checks if it's a regular data extension.
  • Path of the Data Extension: Approximately two calls, depending on how deeply nested your data extension is.
  • Queries Targeting this Data Extension: One call.
  • Queries Including this Data Extension: One call.
  • Import Activities Targeting this Data Extension: One call.
  • Filter Activities Targeting this Data Extension: One call.
  • Content Builder Emails Using this Data Extension: One call.
  • User-Initiated Emails Using this Data Extension: One call.
  • Journeys Using this Data Extension as Entry Source: This varies. For example, if you have 200 journeys, there will be 4 calls to retrieve the journeys and an additional 200 calls for event definitions (since each journey will require a separate event definition call). This adds up to approximately 204 calls.
  • Script Activities Including this Data Extension: Depends on the count. For 100 scripts, this results in 2 calls.
  • CloudPages Including this Data Extension: Similarly, for 100 CloudPages, expect 2 calls.

Impact of "Journeys Using this Data Extension"

When the "Journeys Using this Data Extension" option is enabled, the total API calls can quickly escalate—roughly 220 calls when there are many journeys, compared to just 17 calls when it's not used. Given this, frequent use of this feature without optimization can significantly impact performance and resource consumption.

Optimization Suggestions

To mitigate potential issues with high API consumption, you could implement caching or database solutions, especially if this feature will be used frequently. Relying on real-time API calls for every request without optimization may lead to performance bottlenecks and increased API usage costs.

Emails/CloudPages

Searching for emails typically doesn't generate the same high API consumption as Data Extensions. However, if you have a large number of Journeys, it’s crucial to optimize API calls. For instance, with 3,000 Journeys, every search can result in 60 API calls due to the need to paginate through results.

Similarly, searching for CloudPages isn’t as resource-intensive. But, if your account contains a large number of emails, optimization is again necessary. For example, with 3,000 emails, each search would result in 60 API calls. Same for CloudPages while searching in them.

Calls for User-Initiated Emails or Triggered Sends do not consume much because both of them only require one call.

Additional Information

Definition Requests

While working heavily with, it is really helpful to use Definition Requests to understand which fields are retrievable and which not because documentation might be confusing sometimes.

          <DefinitionRequestMsg xmlns="http://exacttarget.com/wsdl/partnerAPI">
            <DescribeRequests>
              <ObjectDefinitionRequest>
                <ObjectType>Data Extensions</ObjectType>
              </ObjectDefinitionRequest>
            </DescribeRequests>
          </DefinitionRequestMsg>        

Formulating Links for Assets

Initially, I considered adding direct links to the assets within the app. Though I decided not to include it in the final version, I’d like to share how this can be done.

For all assets (except automations), creating links is straightforward. You can dynamically build the links by retrieving stack information from the getUserInfo call, and using that stack value within the URL.

Data Extensions: 

https://mc.${stack}.marketingcloudapps.com/contactsmeta/admin.html#admin/data-extension/${DE.ObjectID}/properties/

Filters and Scripts use modal:

ttps://mc.${stack}.exacttarget.com/cloud/#app/Automation%20Studio/AutomationStudioFuel3/%23ActivityModal/${typeID}/${objectID}

Other Automation Activities use Activity Details, full page:

https://mc.${stack}.exacttarget.com/cloud/#app/Automation%20Studio/AutomationStudioFuel3/%23ActivityDetails/${typeID}/${objectID}

Journeys:
https://mc.${stack}.exacttarget.com/cloud/#app/Journey%20Builder/%23${objectID}/1

Here, the "1" at the end refers to the journey version. Without specifying it, the link won’t work, so I defaulted to 1.        

If you wonder what those typeIDs are, you can look for them here. For example, it is '43' for Import Activities.

What about Automations?

Automations use a link formation like:

https://mc.${stack}.exacttarget.com/cloud/#app/Automation%20Studio/AutomationStudioFuel3/%23Instance/${someID}        

However, the ID required for this is not the typical UUID format that SFMC uses for Object IDs. After investigating through Network tabs and various API calls, I discovered an unofficial endpoint SFMC uses for retrieving this ID:

https://baseURL.rest.marketingcloudapis.com/legacy/v1/beta/automations/automation/definition?$skip=0&$itemsPerPage=50        

This API uses pagination through skip=0, skip=1, etc., unlike typical REST APIs that use page.

The response will contain the following structure:

{
"startIndex": 0,
"itemsPerPage": 50,
"totalResults": 93,
"entry": [
    {
        "id": "someID",
        "key": "someKey",
        "createdDate": "2021-06-02T22:42:16.51Z",
        

That someID is what you need to generate the automation link.

Important note: Please do not use this endpoint in a production environment, as it is an unofficial API and not recommended for production use.

Conclusion

Building the SFMC Asset Relationship Finder was not just about creating a tool to simplify asset management—it was a personal journey of growth and learning. It was a huge challenge for me to dive deep into Golang, a language I initially knew little about, while simultaneously developing an entire package to uncover asset relationships, something I’ve always wanted to achieve.

I’m also proud to have created the first public Golang app for Salesforce Marketing Cloud, as I couldn’t find any on GitHub before this :)

I hope this app brings as much clarity and ease to others as it did to me. It’s more than just a solution; it’s the result of passion, curiosity, and the desire to make the complex simple.

Arthur Noort

Automation Consultant & Solution Architect

2w

Seems like a great tool. Just wondering, why not include searching inside content blocks too? Would that be too resource intensive?

Like
Reply
Raheem Shaik

10x Salesforce Certified

3w

Outstanding thought & implementation

Ali Riza Tuncer

Founder & CHO @beynart | Marketing & Brand Strategist | Digital Transformation Expert | Martech, Design, Web3 & Crypto Enthusiast

3w

Great job Selim 👏🏼 Thank you for sharing this and new ideas 🫶🏼

Taoufiq BENALLAH

Senior Marketing Cloud Developer & Consultant

3w

How can you retrieve queries using the data extension without parsing all the Instance queries ? or the calls are already so long ? Nice APP

Like
Reply

To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics