Documents in MCRM (SharePoint, DropBox, Google Drive, OneDrive) – a technical deep dive – Part 3
written by Jan Slodicka on August 15, 2017
In today’s post, we will examine the details of how Resco Mobile CRM works with file hosting services, focusing especially on SharePoint, which is commonly integrated with Dynamics 365/CRM. In previous installments of this series, we defined the basic terminology and the presentation of documents in the app’s user interface (Part 1), as well as document filtering, viewing, editing, storage and safety (Part 2).
6. Details about file hosting services
MCRM presents the documents as attachments to specific entity records. To comply with this principle, MCRM expects a specific folder structure on the file server*:
<root_folder>/ // Specified by the Woodford admin **
  <entity_type_name>/ // account, contact etc.
    <record_name>_<record_id>/ // Folder for specific entity record
      <file_name> // Cloud document attached to the record
Here is an example: /My MCRM Folder/contact/John_Smith_12aaa7f78a7f41deaf4bcaccb43df2ce/Capture_170627_171139.jpg
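To make the convention concrete, here is a minimal Python sketch that assembles such a path. The function name and its parameters are made up for illustration, and the name handling (spaces replaced by underscores, record ID written as lowercase hex without dashes) is only inferred from the example above; it is not a documented MCRM rule.

```python
from pathlib import PurePosixPath


def cloud_document_path(root_folder, entity, record_name, record_id, file_name):
    """Build the expected server path of a cloud document (illustrative only)."""
    if not root_folder:
        # The root folder must not be empty, otherwise cloud documents are not synced.
        raise ValueError("root folder must not be empty")
    # Assumption inferred from the example: spaces become underscores and the
    # record ID is written as lowercase hex without dashes.
    safe_name = record_name.replace(" ", "_")
    compact_id = record_id.replace("-", "").lower()
    record_folder = f"{safe_name}_{compact_id}"
    return str(PurePosixPath("/", root_folder, entity, record_folder, file_name))


print(cloud_document_path(
    "My MCRM Folder", "contact", "John Smith",
    "12aaa7f7-8a7f-41de-af4b-caccb43df2ce", "Capture_170627_171139.jpg"))
```

With these inputs the sketch reproduces the example path shown above.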
This folder structure can be difficult to maintain when new documents are primarily added on the server side. However, if the documents are mainly added on the client side, the whole system works effortlessly:
- The user creates a cloud attachment
- The attachment is uploaded during the next sync
- The remaining users get the attachment during their next sync
No conflict testing during sync
MCRM implements simplified synchronization for file hosting services consisting of two steps:
- Upload client changes (new documents, changed documents, document deletes)
- Download server changes since the last synchronization
This algorithm does not properly handle conflicts resulting from simultaneous editing of one document by multiple users. Should this situation arise, the rule “Last sync wins” applies.
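The effect of “Last sync wins” is easy to see in a toy simulation. The sketch below is not MCRM code: the Server and Client classes, the logical clock and the file name are invented for illustration, and deletes are left out for brevity. It only mirrors the two steps described above.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """In-memory stand-in for a file hosting service (hypothetical)."""
    files: dict = field(default_factory=dict)       # path -> content
    changed_at: dict = field(default_factory=dict)  # path -> logical clock tick
    clock: int = 0

    def upload(self, path, content):
        self.clock += 1
        self.files[path] = content  # overwrites without any conflict check
        self.changed_at[path] = self.clock

    def changes_since(self, tick):
        return {p: self.files[p] for p, t in self.changed_at.items() if t > tick}


@dataclass
class Client:
    local: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)  # local edits not yet uploaded
    last_sync: int = 0

    def edit(self, path, content):
        self.local[path] = content
        self.pending[path] = content

    def sync(self, server):
        # Step 1: upload client changes.
        for path, content in self.pending.items():
            server.upload(path, content)
        self.pending.clear()
        # Step 2: download server changes since the last synchronization.
        self.local.update(server.changes_since(self.last_sync))
        self.last_sync = server.clock


server, alice, bob = Server(), Client(), Client()
alice.edit("offer.docx", "Alice's version")
bob.edit("offer.docx", "Bob's version")
alice.sync(server)  # the server now holds Alice's version
bob.sync(server)    # Bob syncs later and silently overwrites it
alice.sync(server)  # Alice receives Bob's version; her edit is gone
print(server.files["offer.docx"], "|", alice.local["offer.docx"])
```

Nobody is warned at any point; whoever syncs last determines what everybody ends up with.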
The communication itself is safe as all 3 services encrypt data in transit (utilizing the HTTPS protocol). MCRM stores unencrypted data in the cloud, i.e. anyone with the correct login can use it. This is both an advantage (data can be accessed by other apps) and a disadvantage (less security). Whether the data stored in the cloud (“data at rest”) is encrypted or not depends on the service itself. Usually the business (Pro) versions provide data encryption.***
Note also that MCRM does not execute any user access control as far as the documents are concerned. This means, for example, that a user who does not have read permissions for a certain entity record can still see its cloud documents on the file server, provided they have access to the cloud drive, of course.****
If we compare cloud data security to the security of locally stored MCRM data (database, downloaded documents), then MCRM gives the local data further protection, ranging from data encryption to optional security measures.
Summary: File hosting services provide a rather low level of security. The highest risk is a weak user password or its leakage.
*) This structure was inspired by the folder structure used by Dynamics CRM / SharePoint integration.
**) The root folder must not be empty. If it is empty, MCRM will refuse to sync cloud documents. (The sync log will contain a warning such as “WARN: OneDrive: RootFolder=null, sync skipped”)
***) When it comes to security of data at rest, we talk about threats such as stealing the cloud hardware, illegal access to cloud data etc.
****) This is not going to change in the future as the supported cloud services themselves do not provide efficient tools to implement user access control.
7. SharePoint details
How SharePoint organizes the data
At the topmost level, the data is divided into site collections. Site collections consist of sites (a site collection itself is a site, too).
Sites contain lists. A list is a collection of data of some type. SharePoint lets you use various list types – list of contacts, list of tasks, list of surveys, list of files, etc.
MCRM uses only the last mentioned list type: So-called document libraries (or just libraries) are special lists that contain files belonging to the library root folder or to one of its subfolders.*
*) SharePoint lists can contain files too – so-called attachments. However, attachments do not have any special support. Unlike attachments, library files have versions, can be checked out and checked in, their content is subject to full-text search etc.
SharePoint integration with Microsoft Dynamics
Dynamics CRM sees SharePoint as a set of document libraries that store documents related to CRM records; other SharePoint aspects are ignored.
Document libraries contain subfolders related to some CRM record; files located in these subfolders are record attachments (different organizations are also possible, but that is beyond the scope of this article).
In the standard integration, the CRM admin selects the entities (account, contact…) that can have SharePoint attachments. In turn, the respective SharePoint document libraries are created: one library per entity, with the library names identical to the entity names.
Formally the Dynamics server contains 2 entities that describe the locations of SharePoint documents that are managed by Dynamics CRM**:
- sharepointsite entity records describe SharePoint sites.
- sharepointdocumentlocation records (“locations”) contain a URL (subfolder of some SharePoint document library) and a reference to the CRM record (regardingobjectid) to which the documents are attached.
Document location A (sharepointdocumentlocation record) has these attributes:
ID = 62b1c3ce-82df-e411-8100-fc15b4263b68
absoluteurl = /contact
regardingobjectid = null
This location has no relation to any particular CRM record (regardingobjectid is not set); it only serves as a parent location for other contact-related locations.
Document location B has these attributes:
parentsiteorlocation = sharepointdocumentlocation[62b1c3ce-82df-e411-8100-fc15b4263b68]
relativeurl = John Smith_CBD10AE0FC33E6118125005056A66018
regardingobjectid = contact[cbd10ae0-fc33-e611-8125-005056a66018]
The SharePoint server has a document library called “Contact” with RootFolder=“/contact”.
This document library contains the subfolder “John Smith_CBD10AE0FC33E6118125005056A66018” with these files:
When you open the John Smith contact (ID cbd10ae0-fc33-e611-8125-005056a66018) in MCRM, these files will be listed in the Cloud Documents section.
This document library stores cloud documents for another 491 contacts (we won’t list them here, of course).
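The relation between locations A and B can be sketched in a few lines of Python. This is a simplified model that assumes only the attributes shown in the example; real sharepointdocumentlocation records carry more fields, and the key "location-b" is a placeholder because the article does not give location B's ID. Both helper functions are hypothetical.

```python
def default_folder_name(record_name, record_id):
    # Convention visible in the example: "<record name>_<ID as uppercase hex, no dashes>".
    return f"{record_name}_{record_id.replace('-', '').upper()}"


def resolve_location_url(location_id, locations):
    """Walk the parent chain of a location and join the URL segments."""
    segments = []
    current = locations[location_id]
    while current is not None:
        if current.get("absoluteurl"):
            # A location with an absolute URL ends the chain.
            segments.append(current["absoluteurl"].strip("/"))
            break
        segments.append(current["relativeurl"].strip("/"))
        parent_id = current.get("parent")
        current = locations.get(parent_id) if parent_id else None
    return "/" + "/".join(reversed(segments))


LOCATION_A = "62b1c3ce-82df-e411-8100-fc15b4263b68"
CONTACT_ID = "cbd10ae0-fc33-e611-8125-005056a66018"

locations = {
    LOCATION_A: {"absoluteurl": "/contact", "regarding": None},
    "location-b": {  # placeholder key, see the note above
        "parent": LOCATION_A,
        "relativeurl": default_folder_name("John Smith", CONTACT_ID),
        "regarding": ("contact", CONTACT_ID),
    },
}

print(resolve_location_url("location-b", locations))
```

The printed URL is the library root folder followed by the record subfolder, i.e. the folder whose files appear in the Cloud Documents section of the John Smith contact.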
The above example is typical in the sense that it uses the default folder structure***:
<entity_type_name>/ // Document location RootFolder
  <record_name>_<record_id>/ // Subfolder for specific entity record
    <file_name> // Cloud document attached to the record
The clients, however, might use different names that better fit their needs. That’s ok; MCRM will traverse the SharePoint document libraries that belong to the sites listed in the sharepointsite table and match them with the folders specified in the sharepointdocumentlocation table. Finally, MCRM will record the information about the CRM documents stored on the SharePoint server. Depending on the SyncFilter, some of these files will be downloaded to the mobile device, while the remaining ones will be available for download on demand.
The situation looks different when an MCRM user creates a new SharePoint document and wants to attach it to some entity record. The question is: To which location should the document be added so that it makes sense for Dynamics users? Here is the algorithm used by MCRM:
- If MCRM finds a single location that is pointing (regardingobjectid) to that entity record, the document will be added to the respective location folder
- If MCRM finds multiple locations, the user is prompted to choose one of them
- If MCRM does not find any location at all, MCRM looks for a suitable parent location (“/contact” for contacts). If found, a new child location is created using the default naming convention. Otherwise the document is refused (MCRM displays an error message “SharePoint not configured for …”)
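The three rules above can be sketched as a small decision function. Everything here is illustrative: the Location class, the choose callback (standing in for the user prompt) and the way a “suitable parent location” is recognized (a location without regardingobjectid whose URL equals the entity name) are assumptions, not MCRM internals.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Location:
    url: str
    regarding: Optional[Tuple[str, str]] = None  # (entity name, record ID) or None


class SharePointNotConfigured(Exception):
    pass


def pick_upload_location(entity, record_id, record_name, locations, choose):
    """Return the location that should receive a new document."""
    matches = [loc for loc in locations if loc.regarding == (entity, record_id)]
    if len(matches) == 1:
        return matches[0]          # rule 1: a single location, use it
    if len(matches) > 1:
        return choose(matches)     # rule 2: let the user decide
    # Rule 3: look for a parent location such as "/contact" (assumed matching rule).
    parent = next((loc for loc in locations
                   if loc.regarding is None
                   and loc.url.strip("/").lower() == entity), None)
    if parent is None:
        # The article elides the rest of the message, so it is kept elided here.
        raise SharePointNotConfigured("SharePoint not configured for …")
    folder = f"{record_name}_{record_id.replace('-', '').upper()}"  # default naming
    child = Location(url=f"{parent.url.rstrip('/')}/{folder}",
                     regarding=(entity, record_id))
    locations.append(child)
    return child


known = [Location(url="/contact")]
target = pick_upload_location(
    "contact", "cbd10ae0-fc33-e611-8125-005056a66018", "John Smith",
    known, choose=lambda candidates: candidates[0])
print(target.url)
```

In this run no location points to the record yet, so the third rule applies and a child location is created under “/contact”; a second call for the same record would then hit the first rule.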
Here is the basic schema of the MCRM/SharePoint synchronization:
- Entities sharepointsite and sharepointdocumentlocation are synchronized similarly to any other CRM entity (e.g. SyncFilter applies).
- SharePoint documents changed at the client (created/edited/deleted) are uploaded to the SharePoint server one by one.****
- For each site in sharepointsite table:
  - MCRM asks SharePoint server for all site lists (GetListCollection web request)
  - Among the returned lists MCRM selects those which represent document libraries (the remaining lists are ignored)
  - Among the document libraries MCRM selects those whose RootFolder matches some sharepointdocumentlocation record (more precisely its URL)
  - Documents belonging to these document libraries are synced against MCRM local storage.
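The per-site selection can be pictured as a simple filter. In the sketch below the dictionaries are a hypothetical, already parsed form of the server responses (the keys title, is_library and root_folder are invented), and the matching rule (a location URL equal to the library root folder or lying below it) is an assumption about what “matches” means.

```python
def select_libraries(site_lists, location_urls):
    """Keep document libraries whose root folder matches a document location URL."""
    def norm(url):
        return url.strip("/").lower()

    wanted = {norm(url) for url in location_urls}
    selected = []
    for lst in site_lists:
        if not lst["is_library"]:
            continue  # lists that are not document libraries are ignored
        root = norm(lst["root_folder"])
        # Assumed rule: the location is the library root or a folder below it.
        if any(w == root or w.startswith(root + "/") for w in wanted):
            selected.append(lst["title"])
    return selected


site_lists = [
    {"title": "Contact", "is_library": True, "root_folder": "/contact"},
    {"title": "Tasks", "is_library": False, "root_folder": "/Lists/Tasks"},
    {"title": "Archive", "is_library": True, "root_folder": "/archive"},
]
location_urls = ["/contact", "/contact/John Smith_CBD10AE0FC33E6118125005056A66018"]

print(select_libraries(site_lists, location_urls))
```

Only the “Contact” library survives both filters here; “Tasks” is not a library and “Archive” has no matching location, so neither would be synced.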
The basic sync schema looks straightforward; however, there are problems that often cause unacceptable sync performance.
First of all, the GetListCollection response is incomplete, and MCRM has no choice but to send an extra GetList web request for each list to obtain the additional list details.***** You can imagine what this means if the SharePoint server uses many lists. If you encounter this problem, use SharePointListMap as described in the tip below.
Also, any unnecessary records in sharepointsite/sharepointdocumentlocation tables imply unneeded web requests. Drop these records if possible, or – if you cannot modify these tables – use SyncFilter to avoid unwanted entries.
****) As for files, there is no conflict resolution (“Last sync wins”).
*****) List’s RootFolder is not returned, hence the list cannot be matched against sharepointdocumentlocation records.
Tip: Using SharePointListMap
The following applies to a situation when communication with the SharePoint server takes too long because there are too many SharePoint document libraries.
As explained above, the basic problem is the extra GetList web requests caused by the missing root folder information. These may take ages, hence MCRM caches the root folders permanently so that GetList requests can be avoided during the next sync. This optimization limits the problem to FullSync only.
Now you can go even further and export cached root folders to other clients to speed up their FullSync.
Ideally the Woodford admin executes the synchronization:
- Open the Setup form > Accounts > SharePoint > Export SharePoint List Map
- An email opens with instructions and data.
- By following these instructions you’ll distribute the SharePoint List Map to the remaining clients.
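The idea behind the root folder cache and its export can be illustrated with a toy model. The real SharePointListMap format is not described here, so the JSON string below is only a stand-in for it; the list IDs, the fake server and the function names are hypothetical as well.

```python
import json


def root_folder_for(list_id, cache, fetch_root_folder):
    """Return a list's root folder, asking the server only on a cache miss."""
    if list_id not in cache:
        cache[list_id] = fetch_root_folder(list_id)  # stands in for a GetList request
    return cache[list_id]


def make_fake_server(root_folders):
    """Return a fetch function plus a counter of simulated GetList requests."""
    stats = {"requests": 0}

    def fetch(list_id):
        stats["requests"] += 1
        return root_folders[list_id]

    return fetch, stats


server_data = {"list-1": "/contact", "list-2": "/account", "list-3": "/Lists/Tasks"}
fetch, stats = make_fake_server(server_data)

admin_cache = {}
for list_id in server_data:  # first (full) sync: one request per list
    root_folder_for(list_id, admin_cache, fetch)
first_sync_requests = stats["requests"]

for list_id in server_data:  # next sync: served entirely from the cache
    root_folder_for(list_id, admin_cache, fetch)
second_sync_requests = stats["requests"] - first_sync_requests

exported = json.dumps(admin_cache)          # stand-in for the exported list map
other_client_cache = json.loads(exported)   # imported on another client
before = stats["requests"]
for list_id in server_data:  # that client's full sync needs no GetList at all
    root_folder_for(list_id, other_client_cache, fetch)
other_client_requests = stats["requests"] - before

print(first_sync_requests, second_sync_requests, other_client_requests)
```

Only the very first sync pays for the per-list requests; every later sync, and every client that imported the map, gets the root folders for free.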
Worst SharePoint scenario I ever came across
We had a customer testing SharePoint integration and they complained about 20+ minutes sync times even with just minimal file transfers.
Long story short, here is the relevant excerpt from the SyncLog that contains SharePoint stats:
<SharePointDownload Tim=1298391ms GetListCollection=17x/88673ms GetList=22712x/1147765ms GetListItems=43x/23324ms #FileInfosSaved=4 #NotDownloaded=3/>
<FileDownloadQueue Threads=10 Tim=912140ms Downloaded=1 TotalDownloadSize=101K Speed=0.001Mbps/>
The overall result was 1 file downloaded from the SharePoint server in a whopping 1298 seconds. The customer was right. So what’s going on?
Try to read the stats left to right:
- The project has 17 SharePoint sites (hence the 17 GetListCollection calls). These calls take 88 seconds (too long … slow server?)
- Together, these 17 sites contain 22712 lists (one GetList call per list)
- Out of these 22712 lists only 43 are relevant to Dynamics server (GetListItems), i.e. only these lists match some sharepointdocumentlocation record
- These 43 relevant lists are mostly empty as only 4 files were found (#FileInfosSaved)
- Out of these 4 files only 1 was downloaded (because MaxAttachmentSize was set to 1 MB); the remaining ones are ready to be downloaded on demand
- FileDownloadQueue stats are not relevant in this case. Apparently only 1 file was downloaded and the queue kept waiting for another 900 seconds (hence the low speed)
- Too many sites, too many lists, too few CRM-related lists, 4 relevant files only… simply strange
- This might be because this was just a test; production environment might show different numbers
- There is one huge thing that can be done: If the customer’s Woodford admin generated a SharePointListMap, we could save the 1147 seconds spent on GetList calls
The problem in this case is in the download of the SharePoint control information. Simpler organization (e.g. if Dynamics/SharePoint integration used just 1 site) would bring enormous improvements (must be decided by the customer). By using SharePointListMap the total download time could be cut from ~1300 seconds to ~150 seconds.
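If you want to run the same arithmetic on your own SyncLog, a few lines of Python are enough. The sketch below parses the SharePointDownload line quoted above; the regular expressions are simply fitted to that one line and are not an official log format specification.

```python
import re

log_line = ('<SharePointDownload Tim=1298391ms GetListCollection=17x/88673ms '
            'GetList=22712x/1147765ms GetListItems=43x/23324ms '
            '#FileInfosSaved=4 #NotDownloaded=3/>')

# Total duration of the SharePoint download phase in milliseconds.
total_ms = int(re.search(r"Tim=(\d+)ms", log_line).group(1))

# Each web request type is logged as "<name>=<count>x/<time>ms".
calls = {name: (int(count), int(ms))
         for name, count, ms in re.findall(r"(\w+)=(\d+)x/(\d+)ms", log_line)}

for name, (count, ms) in calls.items():
    print(f"{name:18} {count:6} calls {ms / 1000:8.1f} s "
          f"{ms / count:8.1f} ms/call {ms / total_ms:6.1%} of total")

# What would remain if the GetList calls were avoided entirely.
without_getlist_s = (total_ms - calls["GetList"][1]) / 1000
print(f"Estimated time without GetList: {without_getlist_s:.0f} s")
```

The last figure is where the ~150 seconds estimate comes from: 1298391 ms minus the 1147765 ms spent in GetList leaves about 150.6 seconds.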
Just briefly: SharePoint is a powerful content management system that provides complex security, incl. full-fledged user access control.* MCRM connects to a working Dynamics/SharePoint integration and assumes that security permissions between the two systems are set up properly. If the Dynamics CRM/SharePoint integration is considered secure, MCRM (when properly set up) won’t change the picture.
*) 3rd party tools such as CB Replicator are needed to automatically synchronize Dynamics CRM privileges with SharePoint permissions.
We’ll be wrapping up this technical deep dive in the upcoming week with Part 4, which focuses on file download performance and further tips for handling documents effectively within Resco Mobile CRM.