How To Get Data From SharePoint With Python
- By : Mydatahack
- Category : Data Engineering, Data Ingestion
- Tags: Python, Sharepoint API
It’s sometimes convenient to have a script to get data from SharePoint. We can automate the user managed data ingesting from SharePoint. For example, business users can upload or update the user managed file and a scheduled ETL task fetch and bring it to the datalake.
Using SharePoint API is easy thanks to this amazing module called sharepy. It helps you to authenticate and make REST calls. Microsoft provides comprehensive documentation on the SharePoint REST API. To understand how it works, start off with the article: How the SharePoint Rest service works. The key to any API programming is to construct REST endpoint correctly. Working with folders and files with REST provides you with information for fetching user managed files from SharePoint.
Gocha!
(1) You need to type username and password in the console with sharepy. To pass authentication parameters in the program, I am using the customised module sharepy_custom (see How To Customise Python Module).
(2) We can get the list of files in a folder. However, adding the parameter for GetFolderByServerRelativeUrl() can be tricky. When you are working with only server url, the parameter can contain ‘/’ (GetFolderByServerRelativeUrl(‘/Shared Documents’))in front of shared document as described here.
GetFolderByServerRelativeUrl('Shared Documents')/Files")
If you are using the REST endpoint with sever and site like http://server/site/, you cannot have ‘/’ in front of Shared Document. Otherwise you will get REST Api error “Server relative urls must start with SPWeb.ServerRelativeUrl.
GetFolderByServerRelativeUrl('Shared Documents')/Files")
(3) Constructing relative url for GetFileByServerRelativeUrl can be tricky. The url includes site name and folder name as below.
GetFileByServerRelativeUrl('/Shared Documents/Form.docx')/$value",\
filename = 'document.doc')
r = s.getfile("https:// example.sharepoint.com/GroupSites/HR/_api/web/\
GetFileByServerRelativeUrl('/GroupSites/HR/Shared Documents/Team.xlsx')/$value",\
filename = 'team.xlsx')
(4) Sharepy only works for Python 3. If you use Python 2.7, you will get TypeError: super() takes at least 1 artument (0 given).
Code
We use the customised sharepy. It get the list of a file in the specified folder and write it into a json file and also download a file specified in the API endpoint url.
Enjoy!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | import sharepy_custom as sharepy import json # (1) Authenticate s = sharepy.connect("example.sharepoint.com",\ username=<username>, password=<password>) # (2) Get file information r = s.get("https://example.sharepoint.com/GroupSites/HR/_api/web\ /GetFolderByServerRelativeUrl('Shared Documents')/Files") data = r.json() file = open("sharepoint.json", "w") file.write(json.dumps(data, indent=4)) print("json file has been generated") # (3) Download file r = s.getfile("https:// example.sharepoint.com/GroupSites/HR/_api/web\ /GetFileByServerRelativeUrl('/GroupSites/HR/Shared Documents/Team.xlsx')/$value"\ , filename = 'team.xlsx') print("File Downloaded") |