Simplified and Fast Fraud Detection”
Speaker: Keith Laker
For more blog posts from JavaOne, see the table of contents
Live SQL
- free online Oracle 12C database
- Can save scripts
- Google searchable
- Each OTN (oracle tech network) users sees own copy of data. Sandboxed
- Can download data as CSV
https://livesql.oracle.com/apex/livesql/file/index.html
And for this session the live sql URL
Pattern Matching
- types – regex, sed/awk
- in SQL – row level regex
- new: pattern recognition in a stream or rows – aka can match across rows and columns
- new SQL construct MATCH_RECOGNIZE – ANSII standard; not Oracle specific
Steps
- Bucket and order the data
- This makes the patterns “visible”.
- Used order by or partition by/order by so queries are deterministic (this does not require the paid Oracle partitioning feature)
- Define the pattern
- Regular expression like pattern
- Ex: PATTERN (X+ Y+ Z+) where X/Y/Z is a boolean expression. Ex: bal < PREV(bal)
- Common qualifiers: * + ? {n} {n,} {n,m}
- Also have extra ? for reluctant qualifiers – helps deal with what to do with overlapping matches
- Define measures
- Define columns in output table
- pattern navigation options; PREV, NEXT, FIRST, LAST
- column
- optional aggregates (COUNT, SUM, AVG, MAX, MIN)
- special measures: CLASSIFIER() – which component of the pattern applied to this row and MATCH_NUMBER() – how many matches within each partition – both are good for debugging
- Ex: MEASURES FIRST(x.tstamp) as first_x
- Controlling output
- by default get a column per measure along with the partitioning column (when using one row per match). Get more columns with all rows per match)
- how many rows back: ONE ROW PER MATCH (default) ALL ROWS PER MATCH or ALL ROWS PER MATCH WITH UNMATCHED ROWS (good for debugging)
- where to start next search: AFTER MATCH SKIP PAST LAST ROW (default), also options for next row and relating to variables
Demo
- Find 3 or more small (<2K) money transfers within 30 days. Then find large transfer (?=1M) within 10 days of last small transfer
- Can do in SQL without pattern matching, but a lot of code.
- Can do in Java, but. [copying the database…]
- Showed how to create a table for JSON data – reads into a CLOB and Oracle checks it is valid JSON. Loaded with insert statements because live sql is web based and can’t access underlying file system.
- Can use dot notation to access SQL fields
Sample pattern matching statement:
SELECT * FROM transfers_view MATCH_RECOGNIZE( ORDER BY time_id MEASURES user_id AS user_id, amount AS amount PATTERN (X{3,} Y) DEFINE X AS (amount < 2000) AND LAST(time_id) - FIRST(time_id) < 30, Y AS (amount >= 1000000) AND time_id - LAST(x.time_id)< 10);
My take: This was a two hour “tutorial” which differs from a hands on lab. We were still able to follow along with a laptop or “large tablet.” I followed along with the demos on my Mac. Which also let me play a bit. It was fun. I’ve always liked SQL :). I like that he uses QR codes for the links/blogs he wants people to go to. They are also linked in the PowerPoint when it becomes available.
It was also interesting blogging on my laptop. On my tablet, I blog in HTML because it is a pain to u se the visual editor on the tablet. A laptop has no such problem. But a laptop battery doesn’t last all day so…