Have you heard of zero-copy pass-through and request fan-out based on Protobuf shared fields?

This article describes a Guard for sharing Protobuf fields and how we applied it in the central-control and recall scenarios, where it brought significant CPU and latency savings. Even if you never use the Guard itself, I hope the experience and ideas here are a useful reference.

Introduction

In a recommendation system, user-level fields such as experiment parameters, behavior sequences, and user portraits often need to travel through the entire request pipeline.

Modules such as recall, filtering, and ranking all need user features, and the natural approach is to fetch them once at the start of the request and pass them along from there. Previously, I often wrote it like this:

const GetRecommendReq & oReq;
RankReq oRankReq;
// Deep-copies the whole user portrait into the downstream request.
oRankReq.mutable_user_portrait()->CopyFrom(oReq.user_portrait());

Passing fields through like this has clear advantages: a downstream service that needs user features does not have to fetch them again on every request. In particular, when the upstream splits one request into several sub-requests (fan-out), carrying the user-level features along significantly reduces the RPC overhead the downstream would otherwise spend fetching them.

With the RPC overhead reduced, the greedy next question is: can we also save the cost of this CopyFrom?

As we know, protobuf provides the set_allocated/release family of interfaces, which avoid the overhead of Copy or Swap by transferring pointer ownership directly.

If, instead of transferring ownership of the pointer, we only lend it out, we get shared fields. Lending means handing the field pointer over before use and taking ownership back immediately after use, so that the borrower never deletes it. That is the classic Guard abstraction.

Of course, even without a guard, I believe the idea above is already useful. It can be implemented directly with the pb interfaces:

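const GetRecommendReq & oReq;
// The field is only lent out, but set_allocated needs a non-const pointer.
GetRecommendReq & oMutableReq = const_cast<GetRecommendReq &>(oReq);
RankReq oRankReq;
// Lend: oRankReq now holds the pointer to oReq's user_portrait, with no copy.
oRankReq.set_allocated_user_portrait(oMutableReq.mutable_user_portrait());
Client.Rank(oRankReq);
// Take back: otherwise oRankReq would delete the field when it is destroyed.
oRankReq.release_user_portrait();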
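
To make the lend-then-take-back idea concrete without depending on generated pb code, here is a minimal sketch. FakeReq and Portrait are hypothetical stand-ins that only imitate the ownership rules of set_allocated_xxx/release_xxx (the setter takes ownership, release gives it up); they are not protobuf classes, and BorrowForCall is an illustrative helper rather than anything from our code base.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a large sub-message such as a user portrait.
struct Portrait {
    std::string data;
};

// Hypothetical stand-in for a generated message with one owned sub-message.
// It imitates the ownership rules of set_allocated_xxx()/release_xxx().
class FakeReq {
public:
    FakeReq() = default;
    FakeReq(const FakeReq&) = delete;
    FakeReq& operator=(const FakeReq&) = delete;
    ~FakeReq() { delete portrait_; }  // the message deletes whatever it owns

    Portrait* mutable_portrait() {
        if (portrait_ == nullptr) portrait_ = new Portrait();
        return portrait_;
    }
    bool has_portrait() const { return portrait_ != nullptr; }

    // Takes ownership of p and deletes the previously owned object, if any.
    void set_allocated_portrait(Portrait* p) {
        if (p != portrait_) delete portrait_;
        portrait_ = p;
    }
    // Gives up ownership and leaves the field unset.
    Portrait* release_portrait() {
        Portrait* p = portrait_;
        portrait_ = nullptr;
        return p;
    }

private:
    Portrait* portrait_ = nullptr;
};

// Lends the upstream portrait to a short-lived downstream request.
// Returns true if the downstream saw the very same object (no copy was made).
bool BorrowForCall(FakeReq& upstream) {
    Portrait* shared = upstream.mutable_portrait();
    FakeReq downstream;
    downstream.set_allocated_portrait(shared);            // lend
    bool same = downstream.mutable_portrait() == shared;  // the RPC would go here
    Portrait* returned = downstream.release_portrait();   // take back
    assert(returned == shared);
    return same;
}
```

If the release step were skipped, both upstream and downstream would delete the same Portrait when they go out of scope, which is exactly the double delete the pattern has to avoid.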

For more complex operations, such as copying some fields, sharing others, and modifying yet others (the fan-out scenario), our solution is described below.

Design

Our Guard provides two interfaces, AttachField and DetachField, shown below. They are implemented with pb's reflection mechanism, which pairs every set_allocated with a release (and vice versa), so that each operation is rolled back when the Guard is destroyed.

void AttachField(Message* pMessage, int iFieldId, Message* pFieldValue);
Message* DetachField(Message* pMessage, int iFieldId);

  • AttachField: lends the field to pMessage via set_allocated; when the guard is destroyed, it rolls back with release to prevent a double delete.
  • DetachField: borrows the field out of pMessage via release; when the guard is destroyed, it rolls back with set_allocated to prevent a memory leak.
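
The rollback behaviour of the two interfaces can be sketched as follows. This is not our production Guard: Field, FakeMessage, and ShareFieldGuard are hypothetical names, and FakeMessage merely imitates, with a map of owned pointers, the ownership semantics that a real implementation would presumably obtain through pb reflection (Reflection::SetAllocatedMessage and Reflection::ReleaseMessage).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Field { std::string data; };  // hypothetical sub-message

// Hypothetical stand-in for Message plus reflection: fields are addressed by
// id, and the message owns (and deletes) every pointer it holds.
class FakeMessage {
public:
    FakeMessage() = default;
    FakeMessage(const FakeMessage&) = delete;
    FakeMessage& operator=(const FakeMessage&) = delete;
    ~FakeMessage() { for (auto& kv : fields_) delete kv.second; }

    void SetAllocated(int id, Field* p) {  // takes ownership of p
        Field*& slot = fields_[id];
        if (slot != p) delete slot;
        slot = p;
    }
    Field* Release(int id) {               // gives up ownership
        auto it = fields_.find(id);
        if (it == fields_.end()) return nullptr;
        Field* p = it->second;
        fields_.erase(it);
        return p;
    }
    Field* Get(int id) const {
        auto it = fields_.find(id);
        return it == fields_.end() ? nullptr : it->second;
    }

private:
    std::map<int, Field*> fields_;
};

// Records the inverse of every operation and replays them in reverse order.
class ShareFieldGuard {
public:
    ShareFieldGuard() = default;
    ShareFieldGuard(const ShareFieldGuard&) = delete;
    ShareFieldGuard& operator=(const ShareFieldGuard&) = delete;

    // Lend value to msg; the rollback releases it so msg never deletes it.
    void AttachField(FakeMessage* msg, int id, Field* value) {
        assert(msg != nullptr);
        msg->SetAllocated(id, value);
        undo_.push_back({Op::kRelease, msg, id, nullptr});
    }
    // Borrow the field out of msg; the rollback hands it back.
    Field* DetachField(FakeMessage* msg, int id) {
        assert(msg != nullptr);
        Field* value = msg->Release(id);
        undo_.push_back({Op::kSetAllocated, msg, id, value});
        return value;
    }
    ~ShareFieldGuard() {
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it) {
            if (it->op == Op::kRelease) it->msg->Release(it->id);
            else it->msg->SetAllocated(it->id, it->value);
        }
    }

private:
    enum class Op { kRelease, kSetAllocated };
    struct Undo { Op op; FakeMessage* msg; int id; Field* value; };
    std::vector<Undo> undo_;
};
```

In this sketch a field that already existed in the target of AttachField is deleted by SetAllocated and is not restored on rollback, so it is only suitable for attaching into fields that are unset.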

Rollback is FILO, i.e. strictly in the reverse order of the operations. This matters because release and set_allocated are not strictly symmetric, so any other rollback order can cause problems, for example when fields are attached inside a loop.

Since C++ also destroys local objects in the reverse order of their construction (https://isocpp.org/wiki/faq/dtors#order-dtors-for-locals), the Guard must be declared after the pb messages it operates on, so that it is destroyed, and rolls back, before they are.
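
The destruction order that this rule relies on is easy to check in isolation. In the small sketch below, Tracer and DestructionOrder are hypothetical names used only for illustration; the two locals stand in for a pb message and a guard declared after it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Appends its name to a log when destroyed, so the order can be observed.
struct Tracer {
    Tracer(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {
        assert(log != nullptr);
    }
    ~Tracer() { log->push_back(name); }

    std::string name;
    std::vector<std::string>* log;
};

// Returns the names of two locals in the order in which they were destroyed.
std::vector<std::string> DestructionOrder() {
    std::vector<std::string> log;
    {
        Tracer message("message", &log);  // stands in for the pb message
        Tracer guard("guard", &log);      // declared last, destroyed first
    }
    return log;
}
```

Because the guard is destroyed first, its rollback runs while the messages are still alive; declaring it before the messages would make the rollback touch objects that have already been destroyed.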

These two interfaces are enough to cover the patterns that occur in our services:

(1) Caller-side pass-through / fan-out

A field from the upstream request is passed into the downstream request with zero copies. In this case, simply call AttachField.


        const AReq & oAReq;
        BReq oBReq;
        SharePbFieldGuard guard;
        guard.AttachField(&oBReq, BReq::BigFieldId, const_cast<AReq &>(oAReq).mu

(2) Callee-side fan-out

Here some fields differ between the sub-requests, while others are shared or identical. To avoid copying large fields, we detach the heavy fields before copying; once the copy is done, the heavy fields are shared with all sub-requests. The advantage of still using CopyFrom is that we do not have to handle every newly added field by hand; only the heavy fields need special treatment.


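        Req & oReq;
        std::vector<Req> vecMultiReq(n);
        // Declared after the messages so that it is destroyed (rolls back) first.
        SharePbFieldGuard guard;
        // Detach the heavy field so that CopyFrom below does not copy it.
        auto* pField = guard.DetachField(&oReq, Req::BigFieldId);
        for(auto && oSingleReq: vecMultiReq)
        {
            oSingleReq.CopyFrom(oReq);
            oSingleReq.set_field(...);
            // Share the heavy field with this sub-request without copying.
            guard.AttachField(&oSingleReq, Req::BigFieldId, pField);
        }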
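
The detach, copy, attach sequence can also be tried end to end with stand-in types. Everything in the sketch below is hypothetical (Heavy, FakeReq, UndoGuard, FanOut): FakeReq imitates a pb message with one cheap field and one owned heavy field, its CopyFrom counts how often the heavy field is deep-copied, and UndoGuard is a simplified guard that stores its rollback steps as closures.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct Heavy { std::string blob; };  // hypothetical large sub-message

// Hypothetical request: one cheap scalar plus one owned heavy field.
class FakeReq {
public:
    FakeReq() = default;
    FakeReq(const FakeReq&) = delete;
    FakeReq& operator=(const FakeReq&) = delete;
    ~FakeReq() { delete heavy_; }

    int shard = 0;
    static int deep_copies;  // counts deep copies of the heavy field

    // Deep copy in the spirit of CopyFrom: the heavy field is cloned if set.
    void CopyFrom(const FakeReq& other) {
        if (this == &other) return;
        shard = other.shard;
        delete heavy_;
        heavy_ = other.heavy_ ? new Heavy(*other.heavy_) : nullptr;
        if (other.heavy_) ++deep_copies;
    }
    void set_allocated_heavy(Heavy* p) {
        if (p != heavy_) delete heavy_;
        heavy_ = p;
    }
    Heavy* release_heavy() { Heavy* p = heavy_; heavy_ = nullptr; return p; }
    const Heavy* heavy() const { return heavy_; }

private:
    Heavy* heavy_ = nullptr;
};
int FakeReq::deep_copies = 0;

// Simplified guard: every operation registers its inverse, replayed in reverse.
class UndoGuard {
public:
    UndoGuard() = default;
    UndoGuard(const UndoGuard&) = delete;
    UndoGuard& operator=(const UndoGuard&) = delete;
    ~UndoGuard() {
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it) (*it)();
    }
    Heavy* Detach(FakeReq* req) {
        Heavy* p = req->release_heavy();
        undo_.push_back([req, p] { req->set_allocated_heavy(p); });
        return p;
    }
    void Attach(FakeReq* req, Heavy* p) {
        req->set_allocated_heavy(p);
        undo_.push_back([req] { req->release_heavy(); });
    }

private:
    std::vector<std::function<void()>> undo_;
};

// Splits req into n sub-requests that all share its heavy field.
// Returns how many deep copies of the heavy field were made.
int FanOut(FakeReq& req, int n) {
    assert(n >= 0);
    const int before = FakeReq::deep_copies;
    std::vector<FakeReq> subs(n);
    UndoGuard guard;  // declared after the messages, destroyed before them
    Heavy* shared = guard.Detach(&req);
    for (int i = 0; i < n; ++i) {
        subs[i].CopyFrom(req);  // cheap: the heavy field is detached right now
        subs[i].shard = i;
        guard.Attach(&subs[i], shared);
    }
    // The sub-requests would be sent here, while the field is still shared.
    return FakeReq::deep_copies - before;
}
```

When FanOut returns, the guard has released the shared field from every sub-request and handed it back to req, so the sub-requests are destroyed without touching it.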

(3) Sharing multiple fields (the following is a sanitized excerpt of real code)

Since all the pointers involved are of type Message*, the mapping from pb field index to field pointer can be stored directly in a container, and all heavy fields can then be shared in a loop.

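        // Indexes of the heavy fields (left empty in this sanitized excerpt).
        std::vector<uint32_t> vecHeavyField{};
        SharePbFieldGuard oGuard;
        std::unordered_map<uint32_t, ::google::protobuf::Message*> mapIndex2Message;
        // Detach every heavy field once and remember its pointer by index.
        for(auto uField: vecHeavyField)
        {
            mapIndex2Message[uField] = oGuard.DetachField(&oReq, uField);
        }

        for (auto && oSingleReq: vecReq)
        {
            // Only the remaining light fields are copied here.
            oSingleReq.CopyFrom(oReq);

            // Share every heavy field with this sub-request.
            for(auto uField: vecHeavyField)
            {
                oGuard.AttachField(&oSingleReq, uField, mapIndex2Message[uField]);
            }
        }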
