I’ve been spending much of my time lately revising for my first Salesforce Architect exam: Certified Data Architecture and Management Designer. I’m really happy to say I passed, so I’m going to share a few key tips with you. They won’t make you pass the exam (only you can do that), but they may give you some more recent confirmation of the topics covered.
How I Studied for the Exam
- I started, as always, with Trailhead. I put a Trailmix together for this certification, which you are most welcome to make use of; I will update it periodically as new content comes out.
- Experience has taught me not to ignore the study guide, so I followed it and made my own notes as I went through each article listed in it.
- There is a really useful webinar about large data volumes, which I watched, and then I read through some real-life examples of how NOT to manage large data volumes – sadly, they made for familiar reading!
- Once I was familiar with those, I started working with a brilliant Quizlet set written by Patricio Penaherrera. Thank you for sharing that, Patricio – it was a really useful way to revise while I was on the home stretch.
- Another good blog to read is by Maciej Jozwiak – thank you, Maciej.
So, for those of you who’ve read the study guide, these topics won’t come as a surprise, but where I can recall any of the question themes I will include them here for you. Good luck, and please do tweet me if this has helped you!
Large Data Volumes
Read this. Comb it. Know it.
Types of API
- SOAP API
- REST API
- Bulk API
- Streaming API
- Metadata API
- Study techniques for managing data skew:
- Ownership skew – many records with the same owner, which can slow performance
- Lookup skew – many records pointing at the same record through a lookup field – think lots of contacts against a single account
Note that data skew can cause lock contention if you are doing lots of operations, e.g. bulk updates.
Take an account with many contacts as an example. The more contacts sitting against an account, the more likely you are to encounter lock contention, since editing many contacts means the account is locked, then unlocked, for every contact that’s updated.
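To make that concrete, here’s a minimal sketch (the record IDs are made up and this isn’t a real Salesforce API, just plain Python) of why sorting child records by their parent ID before a load helps: each account then appears as one contiguous run, so it’s locked once per run instead of repeatedly across interleaved updates.

```python
from itertools import groupby

def order_for_load(records, parent_key="AccountId"):
    """Sort child records by parent ID so updates to the same
    parent arrive together, reducing repeated lock/unlock cycles."""
    return sorted(records, key=lambda r: r[parent_key])

# Hypothetical contact rows referencing two accounts
contacts = [
    {"Id": "003A", "AccountId": "001X"},
    {"Id": "003B", "AccountId": "001Y"},
    {"Id": "003C", "AccountId": "001X"},
    {"Id": "003D", "AccountId": "001Y"},
]

ordered = order_for_load(contacts)
# Each parent account now appears as one contiguous run
runs = [acct for acct, _ in groupby(ordered, key=lambda r: r["AccountId"])]
print(runs)  # ['001X', '001Y']
```

In the unsorted list the accounts alternate (X, Y, X, Y), so each would be locked and unlocked twice; after sorting, once each.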
- Think about how you might fix your system if you had a user who owned >10k records – what might their position in the hierarchy be, knowing how it can affect performance?
- Hint: see the Sharing & Visibility Designer resources for a clue!
SOQL vs SOSL
- Know the difference between the two:
- SOSL: runs a query using a text string, across multiple objects, searches indexes first, 2k records max
- SOQL: runs a database query using SELECT, within specified objects, you choose whether to select indexed fields when writing the query, 50k records max
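To make the difference concrete, here’s a sketch (object and field names are just illustrative). A SOSL search finds a text term across multiple objects via the search indexes:

```
FIND {Acme} IN NAME FIELDS RETURNING Account(Id, Name), Contact(Id, LastName)
```

whereas a SOQL query SELECTs from one specified object at a time:

```
SELECT Id, Name FROM Account WHERE Name LIKE 'Acme%'
```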
I found it helpful to know off by heart:
- The field types that are indexed automatically by Salesforce
- The types of fields that you can build custom indexes for
- Remember: External IDs are always indexed (that came up for me); it also helps if you can remember which field types can be external IDs
- How you can make queries run faster by selecting indexed fields as a priority
Deterministic Formula Fields
Nowhere I looked told me that these are basically formula fields that don’t pull data through from other objects or use dynamic date functions, e.g. TODAY() and NOW(). So they can be indexed! I had to figure it out by looking at what makes a formula field non-deterministic.
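For example (the custom field names here are hypothetical), a formula that only references fields on the same object is deterministic and can be indexed, while one that depends on a dynamic date cannot:

```
/* Deterministic – references only same-object fields, so it can be indexed */
Amount__c * Quantity__c

/* Non-deterministic – its value changes with the current date, so no index */
TODAY() - DATEVALUE(CreatedDate)
```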
Skinny Tables
- Know what they are and how to enable them
- Be aware of the types of fields that can be included in them
- Understand what the limitations are
- You can’t add new fields to the table; you’ll need to get Salesforce to drop and re-create it
- They only copy down to Full sandboxes
- Know how the Bulk API works:
- You create a job with an operation (query/insert/update/delete/upsert)
- The Bulk API loads the data into temporary tables
- You split the data into batches and submit them to the job
- It processes the batches asynchronously
- Know when to use Serial mode vs Parallel mode and what the default is
- Know how bulk queries work (the following diagram shows my visual representation of what the materials say)
- It’s possible that you’ll be challenged on the best way to handle lock contention during a data load; I knew from studying that the best way is to re-order the records by their parent record IDs so that each parent is handled in sequence – i.e. lock – insert – unlock – on to the next one.
- Lots of the materials describe what happens when the Bulk API encounters locks – I have again visualised this for you. There were a few questions in the exam about it
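As a rough sketch of the job/batch flow above (no real API calls here; the batch size and record shape are made up for illustration), splitting a data set into the batches a Bulk API job would process is essentially:

```python
def make_batches(records, batch_size=10_000):
    """Split records into batches for a Bulk API job; the platform
    then works through them asynchronously (in parallel by default,
    or one at a time if you request Serial mode)."""
    return [records[i:i + batch_size]
            for i in range(0, len(records), batch_size)]

rows = [{"Id": f"003{n:04d}"} for n in range(25_000)]  # hypothetical rows
batches = make_batches(rows)
print(len(batches), [len(b) for b in batches])  # 3 [10000, 10000, 5000]
```

Serial mode trades throughput for safety: batches that touch the same parent records can’t collide on locks if they never run at the same time.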
Primary Key Chunking
- Know what it is (a strategy for optimising large queries by splitting them into chunks of sequential record IDs)
“With this method, customers first query the target table to identify a number of chunks of records with sequential IDs. They then submit separate queries to extract the data in each chunk, and finally combine the results.” – Bud Vieira
- Know when to use it
- Know what issues it can solve
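The steps in that quote can be sketched in a few lines of plain Python (the IDs are hypothetical and the real Bulk API does the chunking server-side; this just shows the idea of deriving sequential ID boundaries, querying each range, and combining the results):

```python
def pk_chunk_ranges(ids, chunk_size):
    """Derive (start, end) ID boundaries for PK chunking: each sub-query
    then filters WHERE Id >= start AND Id < end (end=None is open-ended)."""
    ordered = sorted(ids)
    starts = ordered[::chunk_size]
    return [(s, starts[i + 1] if i + 1 < len(starts) else None)
            for i, s in enumerate(starts)]

ids = [f"001{n:05d}" for n in range(10)]  # hypothetical sequential account IDs
ranges = pk_chunk_ranges(ids, chunk_size=4)
print(ranges)
# [('00100000', '00100004'), ('00100004', '00100008'), ('00100008', None)]

# "Extract" each chunk separately, then combine – every ID lands in
# exactly one half-open range, so nothing is missed or duplicated.
extracted = [i for s, e in ranges
             for i in ids if i >= s and (e is None or i < e)]
```

Because each sub-query scans only a narrow, indexed ID range, none of them times out the way one giant full-table query can.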
- Study how to improve query performance; this is where you will need to understand what types of fields can be indexed.
- Think of tools you could use to deal with dupes.
- Know which objects you can use Duplicate Management for
- Know what Clean can do to help
- Know the difference between Clean and Prospector
- Managing architecture between a legacy CRM, Salesforce, and an ERP is a key element of this exam.
- Know what the best practices are – whether you should integrate all three systems, whether you should use web services, etc.
- It KILLED me that I couldn’t have paper and a pen to draw this out, by the way, so if you can get to a test centre it’ll be better for you.
Data Quality & Governance
- Know what responsibilities fall under data stewardship, data architecture and data governance in an enterprise environment:
- Data Governance: The management of availability, usability, integrity and security. Usually includes a committee or council who defines procedures and plans the execution of those procedures
- Data Stewardship: The management of an organisation’s data assets, to give users high-quality data (e.g. a Business Intelligence / Reporting department within an organisation)
- Data Architecture: The management of the database itself
- Know the considerations that feed into data quality:
- A few questions came up on how you can improve data quality; some answers included dashboards and Data.com
- Generally developing a data management plan that covers the 3 areas above is a good answer
Relationship to the Integration Architecture Exam
I’d recommend doing this exam before you try the Integration Architecture exam, as there are definitely some overlaps in content between the two.
The Integration Architecture exam includes content around backup optimisation; knowledge of PK chunking and avoiding timeouts will stand you in good stead there. I’d recommend you take these two exams relatively close together, while the information is still fresh in your mind.
That’s all I can remember for now, but I will keep updating this article as more details come back. I wish you all the very best of luck in passing your certification exam! Tweet me @gemziebeth if you’d like to ask any specific questions :0)