Building an advanced sensor to monitor and dissect network messages is no easy task, especially when dealing with modern, complex cloud services.
At Attribute we take special pride in our Sensor. It lets us see into network traffic and gain insights about cloud resource usage and application flows, so we constantly invest in developing the Attribute Sensor's capabilities.
In this blog post we will share what we learned while adding new dissecting capabilities to support Google Pub/Sub.
The Challenge: Dissecting Google Pub/Sub Protocol
Our challenge was clear: how do we add support for dissecting messages from the Google Cloud Pub/Sub service, which operates over the gRPC protocol, while also covering other GCP services and Python-based gRPC communication?
At the core of this challenge was PubSub, Google’s messaging queue service, which communicates using gRPC—an HTTP/2-based protocol that serializes messages in Protobuf. We needed to understand how to capture and decode these messages at a granular level, all while dealing with encrypted and compressed data traveling across the network.
The First Step: Setting Up the PubSub Environment
We began by creating a controlled environment in Google Cloud to generate PubSub messages. This involved setting up a PubSub topic, creating a subscription channel, and crafting Python scripts to simulate real-world publishing and subscribing events.
We then wrote a simple publisher script to send messages to the topic, and a subscriber script that waits for messages on the created subscription channel.
Here is a partial section of the publisher script logic:

```python
from google.cloud import pubsub_v1
from google.oauth2 import service_account

# TEST_CONFIG and logger are defined elsewhere in the script.

def test_pubsub_v1_publisher_client():
    # Create a Pub/Sub publisher client with the credentials
    credentials = service_account.Credentials.from_service_account_file(
        TEST_CONFIG["service_account_json_file_path"]
    )
    publisher = pubsub_v1.PublisherClient(credentials=credentials)

    # Publish the message
    message_data = "Hello, Pub/Sub!"
    future = publisher.publish(
        topic=TEST_CONFIG["topic_path"], data=message_data.encode("utf-8")
    )
    logger.info(f"Published message: {future.result()}")
```
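For completeness, here is a sketch of the subscriber side. The `subscription_path` key in `TEST_CONFIG` is our assumption, mirroring the publisher's config; the message callback is kept as a standalone function:

```python
import logging

logger = logging.getLogger(__name__)

def handle_message(message):
    """Callback invoked for each Pub/Sub message; logs and acknowledges it."""
    logger.info("Received message: %s", message.data.decode("utf-8"))
    message.ack()

def run_subscriber(test_config):
    # Third-party imports kept local so handle_message stays importable
    # without the google-cloud-pubsub package installed.
    from google.cloud import pubsub_v1
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        test_config["service_account_json_file_path"]
    )
    subscriber = pubsub_v1.SubscriberClient(credentials=credentials)
    # Blocks until cancelled; messages are dispatched to handle_message.
    future = subscriber.subscribe(
        test_config["subscription_path"], callback=handle_message
    )
    future.result()
```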
With everything set up, the goal was for our sensor to catch the PubSub messages as they traveled across the network. However, the sensor didn’t detect any network traffic or SSL library activity. This initial failure pushed us to investigate the network stack in greater detail.
Digging Deeper: Understanding the PubSub Network Stack
After some research, we found that Google PubSub communicates via gRPC, which uses HTTP/2 as the transport layer and Protobuf for message serialization. The `grpcio` library, which underpins gRPC in Python, adds further complexity by embedding parts of OpenSSL within its own custom Cython modules for encryption and transport security.
We traced the internal workings of the `grpcio` library, eventually identifying its use of a C-Extension module (`cygrpc.cpython-x86_64-linux-gnu.so`) responsible for handling encrypted gRPC messages. This discovery was key: we now had a target for our sensor—intercepting these low-level gRPC functions to capture PubSub messages.
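One way to locate the loaded C-extension in a running process is to scan its memory map for the `cygrpc` shared object. The sketch below matches the filename loosely, since the ABI tag in the name varies across Python versions and platforms:

```python
import re

# Matches the cygrpc C-extension filename; the ABI tag
# (e.g. cpython-x86_64-linux-gnu) differs per build, so match loosely.
CYGRPC_PATTERN = re.compile(r"\S*cygrpc\.cpython-[\w-]+\.so")

def find_cygrpc_paths(maps_text):
    """Extract unique cygrpc .so paths from /proc/<pid>/maps content."""
    seen = []
    for line in maps_text.splitlines():
        match = CYGRPC_PATTERN.search(line)
        if match and match.group(0) not in seen:
            seen.append(match.group(0))
    return seen

def find_cygrpc_for_pid(pid):
    """Read the live memory map of a process and locate cygrpc."""
    with open(f"/proc/{pid}/maps") as f:
        return find_cygrpc_paths(f.read())
```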
The Breakthrough: Intercepting gRPC Messages
The breakthrough came when we focused on the Transport Security Interface (TSI) within `grpcio`. This module handles the encryption and decryption of gRPC messages, and we realized that by placing probes on specific TSI functions, we could capture the encrypted data as it was processed.
By installing eBPF (extended Berkeley Packet Filter) probes on critical TSI functions like `protect` and `unprotect`, we were able to intercept raw HTTP/2 frames carrying the Protobuf-serialized gRPC messages.
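A minimal sketch of what attaching such uprobes looks like with BCC is shown below. The handler names and the eBPF program body are placeholders (the real probes read the function arguments and ship buffers to user space), and the exact symbol names differ between `grpcio` builds:

```python
TSI_HOOKS = ("protect", "unprotect")  # TSI entry points we probe

def build_probe_specs(library_path, symbols=TSI_HOOKS):
    """Build (library, symbol, handler) attach specs for the uprobes."""
    return [(library_path, sym, f"trace_{sym}") for sym in symbols]

BPF_PROGRAM = r"""
// Placeholder handlers: the real probes read the buffer arguments via
// PT_REGS_PARM macros and submit them through a perf ring buffer.
int trace_protect(struct pt_regs *ctx) { return 0; }
int trace_unprotect(struct pt_regs *ctx) { return 0; }
"""

def attach(library_path):
    # bcc imported lazily: only needed when actually attaching probes.
    from bcc import BPF
    bpf = BPF(text=BPF_PROGRAM)
    for lib, sym, fn in build_probe_specs(library_path):
        bpf.attach_uprobe(name=lib, sym=sym, fn_name=fn)
    return bpf
```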
Following is the `tsi_frame_protector_vtable` struct, which the library uses internally to encrypt plaintext and decrypt ciphertext:
```c
struct tsi_frame_protector_vtable {
  tsi_result (*protect)(tsi_frame_protector* self,
                        const unsigned char* unprotected_bytes,
                        size_t* unprotected_bytes_size,
                        unsigned char* protected_output_frames,
                        size_t* protected_output_frames_size);
  tsi_result (*protect_flush)(tsi_frame_protector* self,
                              unsigned char* protected_output_frames,
                              size_t* protected_output_frames_size,
                              size_t* still_pending_size);
  tsi_result (*unprotect)(tsi_frame_protector* self,
                          const unsigned char* protected_frames_bytes,
                          size_t* protected_frames_bytes_size,
                          unsigned char* unprotected_bytes,
                          size_t* unprotected_bytes_size);
  void (*destroy)(tsi_frame_protector* self);
};
```
Decoding the PubSub Messages
Once we captured the raw HTTP/2 frames, the next step was decoding them. This involved identifying the gRPC message headers (a 1-byte compression flag followed by a 4-byte message length) and the Protobuf-encoded payload that follows. We extended our HTTP/2 decoder to handle this.
Here’s a simplified breakdown of the process:
1. **Intercept gRPC traffic**: Our eBPF probe catches the plaintext data from the gRPC TSI functions.
2. **Decode HTTP/2 frames**: We extract gRPC message headers and their accompanying Protobuf messages.
3. **Parse Protobuf data**: Using the PubSub protocol, we decode messages like `PublishRequest` and `AcknowledgeRequest`.
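The length-prefixed framing in step 2 can be parsed in a few lines of Python. This is a minimal sketch of the standard gRPC "Length-Prefixed-Message" format (1-byte compression flag, 4-byte big-endian length, then the payload):

```python
import struct

def parse_grpc_messages(data):
    """Split a gRPC DATA-frame payload into (compressed, payload) messages."""
    messages = []
    offset = 0
    while offset + 5 <= len(data):
        # 1-byte compression flag + 4-byte big-endian message length.
        compressed, length = struct.unpack_from(">BI", data, offset)
        offset += 5
        if offset + length > len(data):
            break  # truncated capture; wait for more bytes
        messages.append((bool(compressed), data[offset:offset + length]))
        offset += length
    return messages
```

The payloads returned here are the Protobuf-encoded messages that step 3 then decodes against the Pub/Sub message definitions.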
This allowed us to dissect Google PubSub messages in real-time, making our sensor a powerful tool for monitoring network traffic in GCP environments.
Extending Support to Other gRPC Versions
One of our final hurdles was ensuring compatibility with multiple versions of the Python `grpcio` library. We implemented logic to dynamically locate and analyze the symbols used across `grpcio` releases. This required diving into binary analysis: we parsed debug symbols from DWARF, used tools such as the Zanalysis framework and Ghidra (with pre- and post-scripts), scripted automated reverse engineering, and extracted symbols to automate the process of identifying the relevant functions in each library version.
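As an illustration of the version-matching logic, the sketch below maps an installed `grpcio` version to the symbol names to probe. The version ranges and symbol names here are hypothetical placeholders; the real tables are produced offline by the binary analysis described above:

```python
# Hypothetical per-version symbol tables: (low, high, symbols), where a
# version v matches when low <= v < high. Real names come from
# DWARF/binary analysis of each grpcio release.
SYMBOL_TABLE = [
    ((1, 0, 0), (1, 40, 0), {"protect": "tsi_protect_v1", "unprotect": "tsi_unprotect_v1"}),
    ((1, 40, 0), (2, 0, 0), {"protect": "tsi_protect_v2", "unprotect": "tsi_unprotect_v2"}),
]

def parse_version(version_string):
    """Convert '1.51.3' into a comparable (1, 51, 3) tuple."""
    return tuple(int(part) for part in version_string.split(".")[:3])

def symbols_for_version(version_string):
    """Pick the probe symbol names matching an installed grpcio version."""
    version = parse_version(version_string)
    for low, high, symbols in SYMBOL_TABLE:
        if low <= version < high:
            return symbols
    raise ValueError(f"unsupported grpcio version: {version_string}")
```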
The Future: Scaling Beyond Python
While our primary focus was on Python, the gRPC framework is used across many programming languages, including C++, Java, and Go. By successfully dissecting Python-based gRPC messages, we have laid the groundwork for expanding this support to other languages with relatively minimal effort. Each new language will require further research into its gRPC implementation, but the core principles remain the same.
Conclusion: A New Level of Visibility
Through careful research, trial and error, and a deep understanding of the Google PubSub and gRPC stack, we were able to build a robust system for intercepting and dissecting PubSub messages. This solution not only enhances visibility into GCP services but also sets the stage for broader gRPC support across different languages and environments.
With these capabilities in place, we’re better equipped to monitor, troubleshoot, and optimize the performance of cloud services using gRPC and PubSub. And as we continue to extend support for additional services, the possibilities for this sensor are only growing.
We hope this blog helps you on your journey of dissecting gRPC-based protocols. If you are looking for more information, please don't hesitate to ping us!
Here is some recommended additional reading:

- https://grpc.io/docs/what-is-grpc/core-concepts/
- https://grpc.io/blog/grpc-stacks
- https://grpc.io/docs/guides/interceptors/