Anonymous 1732
I'm trying to run the following query through pandasql, but the output I get is not what I was expecting. I was expecting to get a table with exactly 800 rows as I am selecting the only employee_day_transmitters of the table employee_days_transmitters, but what I get is a table with more than 800 rows. What's wrong? How can I get exactly 800 rows related to the employee_day_transmitters selected in the table employee_days_transmitters?
```
query_text = '''WITH employee_days_transmitters AS (
SELECT DISTINCT
employeeId
, theDate
, transmitterId
, employeeId || '-' || CAST(theDate AS STRING) || '-' || transmitterId AS employee_day_transmitter
FROM
table1
WHERE variable='rpv'
ORDER BY
RANDOM()
LIMIT
800
)
SELECT
*
FROM
table1
WHERE
(employeeId || '-' || CAST(theDate AS STRING) || '-' || transmitterId) IN (SELECT employee_day_transmitter FROM employee_days_transmitters) AND variable = 'rpv'
'''
table2=pandasql.sqldf(query_text,globals())
```