Psycopg2 to Psycopg3 migration in pgAdmin 4

January 15, 2023

Recently the Psycopg project released a major version, Psycopg3. pgAdmin 4 is immensely dependent on psycopg2 as it is being used as a database driver and the pgAdmin team decided to migrate to the new latest version.

Initially, we thought that it would be hardly 3 to 4 weeks work as it is just a migration but our assumption was incorrect. The reason is we haven’t used psycopg2 "out of the box", we have extended many things and one of them is cursor factory. So, when I started porting, I realized the way we have done that is not supported any more, so I have started digging into the code and found out that in the next release, that was supposed to be included. So, I cloned the repo, built it on my local and used it till 3.1 official release.

Psycopg3 is completely rewritten, so it’s a vast topic to cover, but we will look at some of the areas which affected pgAdmin.

Connection

Psycopg2's name changed to psycopg.

The cursor_factory needs to be passed while creating a connection rather than creating a cursor and psycopg3 introduced row_factory as well and with the help of it, we can retrieve data from the database in more complex structures than the basic tuples.

Psycopg2

import psycopg2 import psycopg2.extras conn = psycopg2.connect(db_string) cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor) cur.execute(sql, params)

Psycopg3

import psycopg conn = psycopg.connect(db_string, cursor_factory=psycopg.AsyncCursor) cur = conn.cursor(row_factory=psycopg.rows.dict_row) cur.execute(sql, params)

Cursors

Both the versions have client side and server side cursors with some different behaviors but psycopg3 introduced Async cursor too. In psycopg2 asynchronous mode, a Psycopg Connection will rely on the caller to poll the socket file descriptor, checking if it is ready to accept data or if a query result has been transferred and is ready to be read on the client whereas in psycopg3, the psycopg Connection and Cursor both have counterparts AsyncConnection and AsyncCursor supporting an asyncio interface.

In order to use an async connection, you need to use async and await.

Psycopg 2

def wait(conn): while True: state = conn.poll() if state == psycopg2.extensions.POLL_OK: break elif state == psycopg2.extensions.POLL_WRITE: select.select([], [conn.fileno()], []) elif state == psycopg2.extensions.POLL_READ: select.select([conn.fileno()], [], []) else: raise psycopg2.OperationalError("poll() returned %s" % state) aconn = psycopg2.connect(database='test', async=1) wait(aconn) acurs = aconn.cursor() acurs.execute(...) wait(aconn)

Psycopg3

async with await psycopg.AsyncConnection.connect() as aconn: async with aconn.cursor() as cur: await cur.execute(...)

Data Adaption

Basically data adaption is the conversion of Python objects to PostgreSQL data types and vice versa. This module is completely re-written.

To convert the Python object to Postgres representation, use psycopg.adapt.Dumper.

To convert PostgreSQL values with type OID to python objects use psycopg.adapt.Loader

Loader example:

conn.execute("SELECT 123.45").fetchone()[0] # Decimal('123.45') conn.adapters.register_loader("numeric", psycopg.types.numeric.FloatLoader) conn.execute("SELECT 123.45").fetchone()[0] # 123.45

We can also write our custom loader or dumper. In pgAdmin, we have extensively used custom loaders. The best example is for SQL-ASCII data.

We first try to decode SQL-ASCII data with raw-unicode-escape encoding, if it fails then again attempt with UTF-8 and last try with ascii.

The pgAdmin sample code for SQL-ASCII Loader

class TextLoaderpgAdmin(TextLoader): def load(self, data): postgres_encoding, python_encoding = get_encoding( self.connection.info.encoding) if postgres_encoding not in ['SQLASCII', 'SQL_ASCII']: if isinstance(data, memoryview): return bytes(data).decode(self._encoding) else: return data.decode(self._encoding) else: # SQL_ASCII Database try: if isinstance(data, memoryview): return bytes(data).decode(python_encoding) return data.decode(python_encoding) except Exception: if isinstance(data, memoryview): return bytes(data).decode('UTF-8') return data.decode('UTF-8') else: if isinstance(data, memoryview): return bytes(data).decode('ascii', errors='replace') return data.decode('ascii', errors='replace')

Multiple results returned from multiple statements

In psycopg2, if we execute multiple statements, then we get only the last query result set but in psycopg3 we can get all the results. After running the query, the first result will be readily available in the cursor and can be consumed using the usual fetch*() methods. In order to access the following results, you can use the Cursor.nextset() method.

cur_pg3.execute("SELECT 1; SELECT 2") >>> cur_pg3.fetchone() (1,) >>> cur_pg3.nextset() True >>> cur_pg3.fetchone() (2,) >>> cur_pg3.nextset() None # no more results

Some other changes

  • You cannot use IN %s with a tuple
  • In psycopg2, using the syntax with connection, only the transaction is closed, not the connection. In Psycopg 3, using with connection will close the connection at the end of the with block.
  • Add_notify_handler: Register a callable to be invoked whenever a notification is received.
  • Add_notice_handler: Register a callable to be invoked when a notice message is received.

Conclusion

As I mentioned earlier, psycopg3 is completely re-written, so it’s a vast topic, but I have covered some of the topics which pgAdmin widely used. Psycopg3 uses more modern Python and PostgreSQL features.

The Migration Reference can be found at: https://www.psycopg.org/psycopg3/docs/basic/from_pg2.html

 

Share this