Is it possible to make connections with Python without using the language packages?

2

I use several modules to connect to websites with Python, such as socket and urllib, but how would this connection be made without the modules? I am aware that this is a complex procedure involving other programming languages, but I am curious about how the programming language interacts directly with the internet.

  • Rewriting all the code of the modules from scratch, including the code in C. Why do you need this?

  • It’s more about curiosity over how things work. Do you know of a link explaining this, or the code itself, so I can try to understand?

  • 1

    You can start by studying the code of the modules you already use. It should be available on GitHub.

  • I tried to do this, but the modules I use rely on other modules to make the connection. Do you know of a module that doesn’t?

1 answer

5

"Without the modules", in Python, you can still use all the built-in functions (builtins): this means that you can communicate with the world using print and input, and even read and write any file with open.

This is much more than can be done in some languages. The modules that ship with the Python standard library are an integral part of the language and can always be considered available. In this respect, Python has historically been one of the best-equipped languages for dozens of use cases. The only exception would be if you wanted to run your program in an environment where disk space is very restricted and therefore wanted to build a custom Python with only part of the standard library: this customization would normally take weeks to do in a way that ensures you do not break other language functionality - and even then, it would be worth considering an alternative implementation such as MicroPython.

In other words: for all intents and purposes, you cannot communicate over sockets or HTTP without the standard library modules. And that’s normal in all languages - even in C you have to include <stdio.h> to use printf and fopen.

In fact, unlike a few years ago, with improved package management (which is still far from perfect), installing a third-party package - from outside the standard library - is no longer considered a barrier. This means, for example, that it is perfectly practical to use the external library requests, which makes HTTP in your program an order of magnitude simpler than directly using urllib and http.client, which ship with Python.

And how do the modules do it?

Regardless of the programming language, the only thing that can provide access to the network, to the screen (the pixels, not the terminal), or to other peripherals such as webcams and microphones is the operating system. The kernel provides specific calls that create "sockets": data structures in memory, managed by the OS kernel itself as new data arrives. Operating systems also have a layer of fundamental libraries that bridge to these kernel calls - that way, any program that can make a call in native code, following the calling conventions of these libraries, can use all the services that the system kernel exposes (and for which the current user has permissions).
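As a minimal sketch of what "the kernel manages the socket" means in practice, the snippet below uses the standard socket module to send bytes to itself over the loopback interface. The function name loopback_roundtrip is made up for this illustration. Note that the client can connect and send before the server side ever calls accept or recv: the kernel queues the connection and buffers the data on its own.

```python
import socket


def loopback_roundtrip(payload: bytes) -> bytes:
    """Send payload through a TCP connection on 127.0.0.1 and read it back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            # The kernel has already completed the handshake and queued the
            # connection; accept() just hands it over to our program.
            conn, _addr = server.accept()
            with conn:
                conn.settimeout(5)
                client.sendall(payload)
                received = b""
                while len(received) < len(payload):
                    chunk = conn.recv(4096)
                    if not chunk:          # peer closed the connection
                        break
                    received += chunk
                return received


if __name__ == "__main__":
    print(loopback_roundtrip(b"hello kernel"))
```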

For some peripherals, even in Windows, the peripheral is exposed as a pseudo-file. This was the case with old printers, for example - in Windows they were exposed under the filename "prn"; on Linux, as "/dev/lp0" or some other name - and writing data to these files sends the bytes directly to the printer. Serial ports, even those emulated by USB devices, still work that way. That is, for this class of peripherals, simply reading from and writing to the corresponding file is enough to communicate with them - without the language or the programmer having to know more details of the low-level library I mentioned above.
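To illustrate the pseudo-file idea with nothing but the built-in open, here is a small sketch that assumes a Unix-like system where "/dev/urandom" and "/dev/null" exist. These two devices stand in for a real peripheral such as a printer or serial port, and the function names are invented for the example.

```python
def read_device_bytes(path="/dev/urandom", count=8):
    """Read raw bytes from a device file, exactly as from a regular file."""
    with open(path, "rb", buffering=0) as device:
        return device.read(count)


def write_device_bytes(data, path="/dev/null"):
    """Write raw bytes to a device file; returns how many bytes were accepted."""
    with open(path, "wb", buffering=0) as device:
        return device.write(data)


if __name__ == "__main__":
    print(read_device_bytes().hex())       # random bytes supplied by the kernel
    print(write_device_bytes(b"hello"))    # bytes discarded by the null device
```

No import is needed here: the kernel routes the read and write calls to the device driver behind the file name.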

And back to this library: it provides the functions declared in the C language headers stdio.h and stdlib.h, among others. That is, even for reading and writing files, the story is the same: your program calls these low-level libraries (their open, read and write functions, etc.), which bridge to the kernel to actually perform the operations.

In Python, the modules we already have bridge the gap to this C library. This is done with different techniques: modules can themselves be written in C and compiled as Python modules - this is the fundamental way to expose native calls to Python code. You write C code that includes the header "Python.h" and fills in some data structures to declare which of its C functions will be visible from Python, which parameters they accept, what they return, etc. The documentation for this is here:
https://docs.python.org/3/extending/building.html . And that’s how the standard library modules that need native calls work. In other languages, there has to be something equivalent - after all, everyone has to make the calls through "glibc" (in the case of GNU/Linux, but not Linux running as the Android kernel, for example). In Windows, I believe the equivalent calls live in system DLLs such as "kernel32.dll", with sockets in "ws2_32.dll". This, as you can see, is quite laborious, but it is "how it is done". The complete documentation of glibc is here: https://www.gnu.org/software/libc/manual/html_mono/libc.html (including all the concepts and documentation on how to create sockets).

In Python there are other ways to call C functions directly, including those in glibc - which can create sockets, read and write files directly, etc. One of them is the "ctypes" module, which lets you set up and call those functions directly. For example, a few years ago, someone needed the sync functionality, which physically flushes all open files to disk, from within Python. It did not yet exist in Python (it was added later, in Python 3.3, as os.sync()) - so I answered with a recipe to call the function in glibc directly:

https://stackoverflow.com/questions/15983272/does-python-have-sync/15983693#15983693

>>> import ctypes
>>> libc = ctypes.CDLL("libc.so.6")
>>> libc.sync()
0

In this case it is quite simple because the sync function takes no parameters and returns nothing. For things like sockets, the calls take and return complex data structures, and all of these have to be declared with ctypes before you can use the functions.
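As a rough sketch of what that looks like for sockets, the example below creates a connected pair of sockets by calling the C library's socketpair, write, read and close directly through ctypes, without importing the socket module. It makes several assumptions: a POSIX system where ctypes.CDLL(None) exposes the C library, and the Linux values AF_UNIX = 1 and SOCK_STREAM = 1 (macOS happens to use the same values; other systems may not). The function name is invented for the illustration.

```python
import ctypes

# Assumption: POSIX system; CDLL(None) gives access to the already-loaded libc.
libc = ctypes.CDLL(None, use_errno=True)

# Assumption: constant values as defined on Linux.
AF_UNIX = 1
SOCK_STREAM = 1

# Declare the C signatures so ctypes converts arguments correctly.
libc.socketpair.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int,
                            ctypes.POINTER(ctypes.c_int)]
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t
libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
libc.read.restype = ctypes.c_ssize_t
libc.close.argtypes = [ctypes.c_int]


def libc_socketpair_roundtrip(payload: bytes) -> bytes:
    """Send payload through a socket pair created with raw libc calls."""
    fds = (ctypes.c_int * 2)()             # int fds[2] for the two descriptors
    if libc.socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0:
        raise OSError(ctypes.get_errno(), "socketpair() failed")
    try:
        if libc.write(fds[0], payload, len(payload)) != len(payload):
            raise OSError(ctypes.get_errno(), "write() failed or was partial")
        buffer = ctypes.create_string_buffer(len(payload))
        received = libc.read(fds[1], buffer, len(payload))
        if received < 0:
            raise OSError(ctypes.get_errno(), "read() failed")
        return buffer.raw[:received]
    finally:
        libc.close(fds[0])
        libc.close(fds[1])


if __name__ == "__main__":
    print(libc_socketpair_roundtrip(b"ping"))
```

A real TCP connection would additionally need the sockaddr_in structure declared as a ctypes.Structure, which is where most of the extra work comes from.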

In addition to writing directly against the CPython API and using ctypes, there are several other ways - from automatic binding generators for native-code libraries, such as SWIG and Boost.Python, to "Cython", which compiles Python-like code to native code and can call C functions.

And that’s the story of how modules "talk" to the system.

For things like urllib, there is no secret: the low-level functionality is sockets - the HTTP protocol can be built and used on top of that in pure Python, by parsing the incoming data and sending the right bytes. Some modules do this in native code (usually in C) for performance reasons. You just need to know how to use bytes, struct and bytearray, and follow the specifications.
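To give an idea of how little is conceptually involved, here is a very reduced sketch of an HTTP/1.1 GET over a plain socket. It is not how urllib is implemented; the function names are invented for the example, and it ignores redirects, chunked transfer encoding, TLS and most of the specification.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Serialize a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",   # ask the server to close when it is done
        "",
        "",                    # blank line terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.split(b"\r\n")
    status_code = int(status_line.decode("ascii").split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


def http_get(host: str, port: int = 80, path: str = "/"):
    """Send the request over a TCP socket and read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

Calling http_get("example.org") would perform a real request, so it needs network access; the other two functions work purely on bytes.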

And for programs that appear on the screen, in their own window?

For things like creating windows, for example: the kernel exposes interfaces for changing the video mode and mapping memory areas to screen areas - or it exposes calls to the GPU that copy memory data to a rectangle on the screen. On top of that, the operating system exposes higher-level calls with drawing APIs, above all for rendering text. In the case of Linux, these APIs are X11, which is being replaced by Wayland. In Windows, you have to check the documentation.
Then realize that even so, creating a program with a window containing a text area would still be quite complicated. The same libraries expose direct calls to "create a window" and a text area that already has some functionality - and even then these calls would be fundamentally different between Windows, Linux and macOS (and even between Windows 7 and Windows 10). This is where the "graphical toolkits" come in - like GTK+, Qt and Tkinter - which take care of the differences between the systems under the hood for us.

Minimal functionality without accessing system libraries

Since in Python you have access to open and all the file functionality, it would be possible, in theory, on Unix systems (including Linux, Android and macOS) to make use of sockets without importing any module. But a little research shows that it probably wouldn’t be possible to create a socket to begin with, and even if it were, you wouldn’t have direct access to socket-specific APIs and would have to create workarounds for the most common situations.

For protocols other than raw sockets, you would still have to re-implement the entire protocol in your own code (for example, a significant part of HTTP 1.1), only to "not have to import the module". It is a situation beyond irrational - equivalent to wanting to build a car in your garage, starting from the screws, just to avoid taking a taxi (or, in this case, to avoid going out in your own car, which already exists, given that the modules are already there).

  • I understood from this answer that using the modules is much more practical and follows the motto of not reinventing the wheel; it was just a matter of curiosity to know how it works.

  • I’ve expanded the answer considerably now.
