Bank Loans
Warning This documentation is archived and is no longer maintained.
- Credit Score Validation
- SSN Validation
- Loan to Value Validation
- Interest Rate Validation
- Duplicate Loan Applications
- Loan Amount Validation
- Loan Completeness Validation
1. Credit Score
Business Check | Collibra DQ Feature | Manual vs Auto |
---|---|---|
Is the credit score a whole number? | BEHAVIOR | AUTO |
Is the credit score within a valid range? (between 300 - 850) |
RULE | credit_score between 300 and 850 |
Is the credit score NULL or Missing? | BEHAVIOR | AUTO |
2. SSN Validation
Business Check | Collibra DQ Feature | Text |
---|---|---|
Is the SSN format valid? | RULE | AUTO-SSN detection |
SSN is PII. | SENSITIVITY | AUTO-SSN labeled |
Is the SSN NULL or Missing? | BEHAVIOR | AUTO |
Does the SSN belong to the Applicant? | PATTERN | SSN -> first_name, last_name |
3. Loan to Value
Business Check | Collibra DQ Feature | Text |
---|---|---|
Are loan amount and asset value (home or auto) valid numbers? |
BEHAVIOR | AUTO |
95% loan to value ratio to approve? | RULE | loan / asset_value < .95 |
4. Interest Rate
Business Check | Collibra DQ Feature | Text |
---|---|---|
Interest rate between min and max allowable range for the loans credit rating. |
RULE COMPLEX |
loan l join rates r on l.credit_rating = r.credit_rating where l.rate between r.min_rate and r.max_rate |
5. Duplicate Loan Applications
Business Check | Collibra DQ Feature | Manual vs Auto |
---|---|---|
Ensure we don't issue the same loan twice. | DUPE | first_n, last_n, SSN, Address |
6. Loan Amount
Business Check | Collibra DQ Feature | Manual vs Auto |
---|---|---|
Loan amount within lendable range | OUTLIER | AUTO |
Loan amount within lendable range. Only lend money between 50K and 3M. |
RULE | loan_amount between 50000 and 3000000 |
Resulting DQ Check
-lib "/home/opt/owl/drivers/postgres" \
-cxn postgres-gcp \
-q "select * from public.loan_risk_grade where last_pymnt_d = '2019-04-01'" \
-key member_id -alias loan_risk \
-ds public.loan \
-rd "2019-04-01" \
-dl -loglevel INFO \
-h 10.142.0.29:5432/owltrunk \
-numexecutors 10 -executormemory 1g -drivermemory 4g \
-master yarn -deploymode cluster \
-sparkprinc [email protected] \
-sparkkeytab /tmp/user2.keytab -tbin MONTH \
-dupe -dupeinc purpose -fpgon -fpgkey grade \
-fpginc grade,sub_grade -fpglb 365 -fpgdc last_pymnt_d \
-record member_id -dupecutoff 60 -dupepermatchupperlimit 99
Which components did we use?
We made use of Profiles, Duplicates, Outliers and Rules in this example. The experiments were automatically cataloged and put on a job scheduler. The next time a loan issue arises, we will be able to take remediation action using the workflow Q. Over time we can see how the bank loan program is running via the report section.