Learning rule representations from data