Exporting LDAP Data to JSON or LDIF Format Using Python 3 and the LDAP3 Library

In this post, we’ve put together a Python code example for connecting to LDAP over SSL, performing a search, and exporting the data to JSON or LDIF. We then massage the JSON into a simpler format for each entry. Note that pandas and numpy are imported but not used in this example; in a future example, we hope to show how to use pandas to analyze LDAP data. Python was chosen because it is a dynamically typed language that is reasonably forgiving in many regards. It is relatively easy to read, doesn’t require a lot of code to get results, and has a large collection of libraries and a great support community.

The code we’re using covers the specific case of gathering attributes on users to help determine licenses used for an IDM product. It can easily be modified to suit your needs. Note that, depending on this document’s formatting, some lines may be truncated or wrapped. Because Python uses indentation rather than curly braces to delimit blocks, it should be easy to tell which code is supposed to continue on the same line.

A quick note about this example: lines that start with two pound signs (##) are notes specific to this document.

## This example was written with Python 3.6 on a Mac.

## Python libraries used.
from ldap3 import Server, Connection, ALL, Tls
import ssl
import os
import json
import numpy as np
import pandas as pd
from ldap3.core.exceptions import LDAPSocketOpenError

## connection based parameters, as well as logging and LDAP search information
logging = False
host = ''
ldapPort = 636
useSSL = True
pwd = 'microfocus1234'
loginID = 'cn=admin,ou=sa,o=system'
userBase = 'o=data'
driverSetBase = 'o=system'
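## connecting to the directory using the above parameters
try:
    #tls_configuration = Tls(validate=ssl.CERT_REQUIRED, version=ssl.PROTOCOL_TLSv1_2, ca_certs_file='cacert.b64')
    #server = Server(host, port=ldapPort, use_ssl=True, tls=tls_configuration)
    server = Server(host, port=ldapPort, use_ssl=useSSL)
    conn = Connection(server, user=loginID, password=pwd)
    # bind() opens the socket and authenticates; it returns False if the bind is rejected
    if not conn.bind():
        print("error in bind", conn.result)
except LDAPSocketOpenError as e:
    # conn may not exist yet if the socket could not be opened, so report the exception itself
    print("error connecting to server:", e)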

## Now that we have a connection, let's do some searches. This section writes the data out to an entries.json file in the local path.

#Search for all users with associations 
conn.search(userBase, '(&(objectclass=inetOrgPerson)(DirXML-Associations=*))', attributes=['DirXML-Associations','fullName','cn','loginDisabled','lastLoginTime'])
count = (len(conn.entries))
x = 0
print('\n\nTotal number of users with associations: ', count)

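f = open("./entries.json","w+")
while (x < count):
    if logging == True:
        # echo each entry to the console when logging is enabled
        print(conn.entries[x].entry_to_json())
    f.write(conn.entries[x].entry_to_json(raw=False, indent=0, sort=False, stream=None, checked_attributes=True) + "\n")
    x = (x+1)
# close the file so everything is flushed before it is read again later
f.close()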

print('\t\t(see the file: entries.json)\n\n')

###End of Users with Associations
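
##A possible tidier form of the write loop above, shown as a sketch rather than the code we ran. dump_entries and FakeEntry are hypothetical names introduced for this note; FakeEntry only imitates the entry_to_json() method so the snippet runs without a directory server. With a live connection you would pass conn.entries instead.

```python
import json
import os
import tempfile


def dump_entries(entries, path):
    """Write one JSON document per entry to path and return how many were written."""
    written = 0
    # the with-block closes (and flushes) the file even if an entry fails to serialize
    with open(path, "w") as out:
        for entry in entries:
            out.write(entry.entry_to_json() + "\n")
            written += 1
    return written


class FakeEntry:
    """Hypothetical stand-in for an ldap3 entry; only entry_to_json() is imitated."""

    def __init__(self, dn, attributes):
        self.dn = dn
        self.attributes = attributes

    def entry_to_json(self):
        return json.dumps({"dn": self.dn, "attributes": self.attributes})


# made-up sample data, written to the system temp directory
demo_path = os.path.join(tempfile.gettempdir(), "entries_demo.json")
demo_entries = [
    FakeEntry("cn=jdoe,o=data", {"cn": ["jdoe"]}),
    FakeEntry("cn=asmith,o=data", {"cn": ["asmith"]}),
]
demo_count = dump_entries(demo_entries, demo_path)
print("entries written:", demo_count)
```

##Iterating over the entries directly avoids the manual x counter, and the with-block removes the need to remember f.close().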

##Another search to get a count of all users in the system based on the search base

###Search for all users in system
conn.search(userBase, '(objectclass=inetOrgPerson)')
count = (len(conn.entries))
print('\nTotal number of users in the system (based on the defined search base): ', count)
###End of all user count

##Return a list of all drivers in the system with trace levels, etc.

###Search for all drivers
conn.search(driverSetBase, '(objectclass=DirXML-Driver)', attributes=['cn','DirXML-TraceLevel','DirXML-TraceFile','DirXML-DriverImage'])
countdrivers = (len(conn.entries))
print('\n\nTotal number of Drivers: ', countdrivers)
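f = open("./drivers.json","w+")

# reset the counter; it still holds the user count from the earlier loop
x = 0
while (x < countdrivers):
    if logging == True:
        # echo each driver to the console when logging is enabled
        print(conn.entries[x].entry_to_json())
    f.write(conn.entries[x].entry_to_json() + "\n")
    x = (x+1)
f.close()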

print('\t\t(see the file: drivers.json)\n\n')
###End of search for drivers

##The ldap3 library doesn't export the data to a JSON format that is conducive for pandas. The section below manipulates the data into a pandas-friendly format. You can stop here, ignore the rest of the code, and just use the entries.json file as needed. You can also change the .entry_to_json() calls above to .entry_to_ldif() to export LDIF instead.

###clean up json in entries.json
filename = 'entries.json'
# append the closing bracket of the JSON array
f2 = open(filename, 'a')
f2.write("]")
f2.close()
with open(filename, "r+") as f:
    old = f.read()  # read everything in the file
    f.seek(0)  # rewind
    f.write("[\n" + old)  # write the opening bracket before the old content
with open(filename, 'r') as f1:
    filedata = f1.read()
    filedata = filedata.replace('}\n{', ',{')
    filedata = filedata.replace('[{', '[')
    filedata = filedata.replace('"attributes": {\n', '')
    filedata = filedata.replace('}]', ']')
    filedata = filedata.replace('}\n}\n]', '}\n]')
# write the cleaned text back over the original file
with open(filename, 'w') as f2:
    f2.write(filedata)
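
##An alternative to the string replacements above, offered as a sketch only: parse each exported document with the json module and flatten it in Python. It assumes every document has a top-level "dn" and a nested "attributes" object, as the replace calls above imply. parse_concatenated_json and flatten_entry are hypothetical helper names, and the sample text is made up.

```python
import json


def parse_concatenated_json(text):
    """Decode a string that holds several JSON documents written back to back."""
    decoder = json.JSONDecoder()
    documents = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here
        if text[pos].isspace():
            pos += 1
            continue
        document, pos = decoder.raw_decode(text, pos)
        documents.append(document)
    return documents


def flatten_entry(entry):
    """Lift the nested "attributes" object up next to "dn"."""
    record = {"dn": entry.get("dn")}
    record.update(entry.get("attributes", {}))
    return record


# made-up text shaped like two exported entries
sample_text = (
    '{\n"dn": "cn=jdoe,o=data",\n"attributes": {\n"cn": [\n"jdoe"\n]\n}\n}\n'
    '{\n"dn": "cn=asmith,o=data",\n"attributes": {\n"cn": [\n"asmith"\n]\n}\n}\n'
)
records = [flatten_entry(entry) for entry in parse_concatenated_json(sample_text)]
print(json.dumps(records, indent=2))
```

##Because the data is parsed rather than pattern-matched, this approach does not depend on the exact indentation or key order of the exported text.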

## This code further cleans up the JSON data to separate the DirXML-Associations so that each user has a new line for each value of the association.

#print('Cleaned JSON\n')
data = json.loads(filedata)

#if os.path.exists("cleaned.json"):
#    os.remove("cleaned.json")

with open('cleaned.json', 'a') as json_file:
    i = 0
    while i < len(data):
        lenAssociations = (len(data[i]['DirXML-Associations']))
        workingdata = (data[i])
        if lenAssociations < 2:
            json.dump(workingdata, json_file)
        if lenAssociations > 1:
            ii = 0
            for value in workingdata["DirXML-Associations"]:
                dn = workingdata["dn"]
                disabled = workingdata["loginDisabled"]
                cn = workingdata["cn"]
                association = workingdata["DirXML-Associations"][ii]
                lastLogin = workingdata["lastLoginTime"]
                fullName = workingdata["fullName"]
                # loginDisabled must be built from the disabled value, not from lastLogin
                json_file.write("{\"dn\": \"[" + str(dn) + "]\", \"loginDisabled\": [" + str(disabled) + "], \"cn\": [" + str(cn) + "], \"DirXML-Associations\": [\"" + str(association) + "\"], \"lastLoginTime\": [" + str(lastLogin) + "], \"fullName\": [" + str(fullName) +"]}")
                ii = ii + 1
        i = i + 1
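
##A sketch of the same split done with dictionaries and json.dumps instead of hand-built strings. split_associations is a hypothetical helper name, and the sample record (including its association values) is made up for illustration.

```python
import json


def split_associations(record, key="DirXML-Associations"):
    """Return one copy of record per association value (or one copy if there are 0 or 1)."""
    values = record.get(key) or []
    if len(values) < 2:
        return [dict(record)]
    # each copy keeps every other attribute and carries exactly one association
    return [{**record, key: [value]} for value in values]


sample_record = {
    "dn": "cn=jdoe,o=data",
    "cn": ["jdoe"],
    "loginDisabled": [False],
    "DirXML-Associations": [
        "cn=DriverA,o=system#1#a1",
        "cn=DriverB,o=system#1#b2",
    ],
}

rows = split_associations(sample_record)
# one JSON object per line; json.dumps handles quoting and booleans
json_lines = "\n".join(json.dumps(row) for row in rows)
print(json_lines)
```

##Letting json.dumps do the serialization means quotes and boolean values come out as valid JSON without a later search-and-replace pass.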

##Further cleanup to get each user on the same row and clean up the format so that it is valid JSON

with open('cleaned.json', 'r') as f5:
    jd = f5.read()
    jd = jd.replace('[[', '[')
    jd = jd.replace(']]', ']')
    jd = jd.replace('\'', '\"')
    jd = jd.replace('}{', '},{')
    jd = jd.replace('}\n{', '}\n,{')
    jd = jd.replace('[\n{', '[{')
# write the cleaned text back to cleaned.json
with open('cleaned.json', 'w') as f6:
    f6.write(jd)

Questions, comments or concerns? Feel free to reach out to us below, or email us at IDMWORKS to learn more about how you can protect your organization and customers.
