Meta patent introduces video calling system for sharing AR scenes
(XR Navigation Network, November 29, 2023) Compared with traditional video call systems, shared AR scene video call systems offer a number of advantages. For example, a shared AR scene video call system can establish dynamic and flexible video calls between multiple participant devices and include the participants as AR effects within a shared AR scene environment. Simply put, the shared AR scene video call system allows participants to launch an AR scene and render video textures depicting the video call participants.
In a patent application titled "Generating shared augmented reality scenes utilizing video textures from video streams of video call participants," Meta describes a method for generating shared augmented reality scenes using video textures from the video streams of video call participants.
In one embodiment, the system described in the patent establishes a video call stream to enable a shared AR scene in which video textures depicting the video call participants are applied as AR effects during the call. For example, the shared AR scene video call system establishes a video call streaming channel for the video call between client devices.
In effect, the shared AR scene video call system enables each client device on the call to transmit video data and video processing data through the video call streaming channel, allowing the client devices to render video textures of the video call participants within the AR scene.
For example, the shared AR scene video call system enables a client device to utilize video data and corresponding video processing data to render a participant's video as a video texture within an AR effect in the AR scene displayed on the client device's video call interface.
In addition, when an AR scene is launched during a video call, the shared AR scene video call system enables the client devices participating in the call to present the AR scene in the video call interface as a 3D (or 2D) graphical scene, rather than simply presenting the raw video captured on each client device.
At the same time, the shared AR scene video call system enables a client device to receive video processing data from the other participant client devices and utilize that data to render each participant's video as a video texture within an AR effect in the AR scene. In effect, the shared AR scene video call system enables client devices to present the video call as an AR scene in which the participants are depicted as video textures, rather than simply rendering the captured video exchanged between client devices.
After receiving the video processing data, the shared AR scene video calling system may allow a client device to utilize the video data and video processing data from a participant's client device to render that participant's video as a video texture within an AR effect.
In one or more embodiments, the shared AR scene video call system enables a client device to render a video texture for each participant, such that multiple participants captured on the same client device are rendered as separate video textures within the AR scene of the video call.
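Splitting one captured camera feed into separate per-participant textures can be illustrated with a minimal sketch. All names here (`BoundingBox`, `split_participant_textures`) are hypothetical; the patent does not specify how the regions are detected or cropped.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Region of a captured frame where one participant appears."""
    x: int
    y: int
    w: int
    h: int


def split_participant_textures(frame, boxes):
    """Crop one captured frame into per-participant texture regions,
    so each participant can be rendered as a separate video texture."""
    textures = []
    for box in boxes:
        crop = [row[box.x:box.x + box.w] for row in frame[box.y:box.y + box.h]]
        textures.append(crop)
    return textures


# A toy 4x8 "frame" with two participants captured side by side.
frame = [[(r, c) for c in range(8)] for r in range(4)]
boxes = [BoundingBox(0, 0, 3, 4), BoundingBox(4, 0, 4, 4)]
textures = split_participant_textures(frame, boxes)
```

Each crop could then be handed to the AR engine as an independent texture, which is what lets two people in front of the same camera receive different AR effects.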
In addition to improving on traditional video calls by rendering participants as video textures in the AR scene, the shared AR scene video calling system is able to accurately render participants captured on other client devices as video textures in the AR scene without accessing the raw video data or raw tracking data from those client devices.
In particular, by establishing a dedicated video processing data channel to transmit the video processing data of the video streams across client devices during the video call, the shared AR scene video call system enables a client device to accurately identify participant areas of interest, such as the face, hair, and eyes, from videos captured by other client devices. In turn, the system enables client devices to render video textures that accurately combine those areas of interest with AR effects.
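A per-frame payload on the dedicated video processing data channel might carry labeled regions of interest alongside a frame identifier, so the receiving device can composite AR effects without the sender's raw video or raw tracking data. This is only an illustrative shape; the patent does not define the wire format.

```python
def make_video_processing_payload(frame_id, regions):
    """Build a hypothetical per-frame message for the video processing
    data channel: labeled areas of interest (face, hair, eyes) keyed
    to the frame they describe."""
    return {
        "frame_id": frame_id,
        # Each region is (x, y, width, height) in frame coordinates.
        "regions": {label: box for label, box in regions},
    }


payload = make_video_processing_payload(
    42, [("face", (12, 8, 96, 96)), ("eyes", (30, 40, 20, 8))]
)
```

The receiver would match `frame_id` against frames arriving on the video data channel and anchor AR effects to the listed regions.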
In fact, the shared AR scene video call system enables the client devices participating in the video call to track multiple participants without each client device having to process the raw video to derive tracking information for every participant. As a result, the shared AR scene video calling system makes efficient use of computing resources.
Figure 1 illustrates an environment 100 implementing a shared augmented reality scene video calling system 106. As shown in Figure 1, environment 100 includes server device 102, network 108, and client devices 110a-110n.
Server device 102, network 108, and client devices 110a-110n may be communicatively coupled to one another, either directly or indirectly. Environment 100 includes server device 102, which generates, stores, receives, and/or transmits digital data, including digital data related to video data and video processing data for video calls between client devices.
Server device 102 includes network system 104. In particular, network system 104 may provide a digital platform with messaging functionality, chat functionality, and/or video calling functionality through which a user may communicate with one or more co-users.
In other embodiments, the network system 104 may include another type of system, including an email system, a video call system, a search engine system, an e-commerce system, a banking system, and so on.
As shown in Figure 1, the server device 102 includes a shared AR scene video calling system 106. In one or more embodiments, the shared AR scene video call system 106 establishes a video call streaming channel between client devices to enable video calls and enables the client devices to render participants' video streams as video textures within AR effects in a shared AR scene.
In effect, the shared AR scene video call system 106 establishes a video call, allowing the client device to present a video call interface and display an AR scene with video call participants as AR elements in the AR scene.
As described above, the shared AR scene video call system 106 may allow a client device to render a shared AR scene with video textures depicting video call participants as AR effects during a video call.
Figure 2 shows the shared AR scene video call system 106 enabling a client device participating in a video call to initiate a shared AR scene in which the video call participants appear as AR effects. For example, the shared AR scene video calling system 106 enables the client device 202 to display a video calling interface 204 for a video call between participants.
Additionally, client device 202 displays selectable option 206 to initiate a shared AR scene during the video call. After detecting selection of selectable option 206, client device 202 displays selectable AR scene options 207.
Client device 202 receives a selection of a specific AR scene option 208 from the selectable AR scene options 207. The client device 202 then utilizes video data and video processing data to render the participants' videos as video textures 212, 214 within AR effects 216, 218 in the AR scene environment 210.
As mentioned above, the shared AR scene video call system 106 can enable the client devices participating in the video call to transmit video data and video processing data, thereby presenting a shared AR scene during the call in which each participant's video is rendered as a video texture within an AR effect.
Figure 3 shows the shared AR scene video call system 106 establishing a video call streaming channel that enables the client devices to transmit video data and video processing data during the video call, so that each client device can render a shared AR scene with video textures depicting the video call participants as AR effects.
As shown in Figure 3, the shared AR scene video call system 106 establishes a video call stream channel 302, which includes a video data channel 304 and an audio data channel 310. The shared AR scene video call system 106 establishes the video call stream between client device 316 and client device 318.
Specifically, the shared AR scene video call system 106 establishes the video call stream channel 302 with video data channel 304 to implement video communication between client device 316 and client device 318. Client devices 316 and 318 may transmit video data 306 and video processing data 308 over the video data channel 304.
In one embodiment, client device 316 and client device 318 also transmit audio data over audio data channel 310 from audio captured on the respective client devices. Video call stream channel 302 may also include video processing data channel 312.
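The channel layout described for Figure 3, with the stream channel 302 carrying separate sub-channels for video data (304), audio data (310), and video processing data (312), can be sketched as a simple routing structure. The class and method names are illustrative, not from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class VideoCallStreamChannel:
    """Hypothetical sketch of stream channel 302: three sub-channels
    mirroring video data channel 304, audio data channel 310, and
    video processing data channel 312."""
    video_data: list = field(default_factory=list)             # channel 304
    audio_data: list = field(default_factory=list)             # channel 310
    video_processing_data: list = field(default_factory=list)  # channel 312

    def send(self, kind, payload):
        """Route a payload onto the named sub-channel."""
        getattr(self, kind).append(payload)


channel = VideoCallStreamChannel()
channel.send("video_data", {"frame": 1, "device": "client_316"})
channel.send("audio_data", {"chunk": 1, "device": "client_316"})
channel.send("video_processing_data", {"frame": 1, "face_box": (10, 20, 64, 64)})
```

Keeping the processing data on its own sub-channel is what lets the receiving device consume tracking annotations independently of the compressed video frames.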
Referring to Figure 3, client device 316 and client device 318 each capture video of a participant using the device. Additionally, client devices 316 and 318 may identify video processing data for the captured video. Client devices 316 and 318 then transmit the segmented video frames through the video call streaming channel 302.
Client devices 316 and 318 receive the combination of video data 306 and video processing data 308 and utilize the data to render the shared AR scene during the video call. As shown in Figure 3, client devices 316 and 318 utilize video data depicting the video call participants to render video textures within AR effects. The client devices then display the video-texture-based AR effects within the video call interface.
In one embodiment, the shared AR scene video call system 106 establishes a video processing data channel to facilitate the transmission of video processing data within the video processing data channel using one or more data exchange formats.
In one embodiment, the shared AR scene video call system 106 may utilize a video or image communication channel as a video processing data channel. For example, the shared AR scene video call system 106 may establish a data channel that facilitates transmission of video, video frames, and/or images as a video processing data channel.
In one embodiment, the shared AR scene video call system 106 may enable the client device to render video textures from the captured video and transmit the video textures over a video processing data channel. For example, during a video call, the client device can receive video textures from the participant client device through the video processing data channel and utilize the video textures in AR effects.
In one embodiment, the shared AR scene video call system 106 may enable client devices to utilize machine learning models to receive and process video textures from participant client devices during a video call through a video processing data channel.
In one embodiment, the shared AR scene video call system 106 can enable shared AR scene video calls between multiple client devices. For example, the shared AR scene video call system 106 may establish a video call streaming channel 302 between client device 316, client device 318, and one or more client devices 314.
Indeed, while transmitting video processing data over the video call stream channel 302, client device 316, client device 318, and the one or more client devices 314 may each render the participants' videos as video textures displayed within the video call interfaces of the multiple client devices.
As mentioned above, the shared AR scene video call system 106 can enable the client device to render the participant's video as a video texture in the AR effect in the AR scene during the video call.
Figure 4 shows a flow chart in which the shared AR scene video call system 106 establishes a shared AR scene video call between client devices. As shown in Figure 4, the shared AR scene video call system 106 can enable the client device to transmit video data and video processing data during the video call, thereby rendering the participant's video as a video texture within the AR effect in the AR scene.
At 402, the shared AR scene video call system 106 receives a request from client device 1 to conduct a video call with client device 2.
At 404, the shared AR scene video call system 106 establishes a shared AR scene video call between client device 1 and client device 2.
At 406, client device 1 sends the first video stream to client device 2 through the video data channel and the audio data channel. At 408, client device 2 sends the second video stream to client device 1 through the video data channel and the audio data channel.
At 410, client device 1 presents the first and second video streams. At 412, client device 2 also presents the first and second video streams.
At 414, Client Device 1 initiates a shared AR scene during a video call with Client Device 2. At 416, client device 1 transmits AR scene data to client device 2 through the video call flow. At 418, client device 2 simultaneously transmits AR scene data to client device 1 through the video call stream.
At 422, client device 1 renders the first video stream into a first video texture within a first AR effect in the AR scene using the first video stream and the video processing data for the first video stream. Additionally at 424, client device 1 renders the second video stream into a second video texture in a second AR effect using the video data and video processing data received from client device 2.
By doing so, client device 1 can render the video call as an AR scene in which the participants of the video call are depicted as video textures, rather than simply rendering the captured video exchanged between client devices.
As shown at 426, client device 2 renders the first video stream into a first video texture in a first AR effect using the video data and video processing data received from client device 1. At 428, client device 2 further utilizes the second video stream and its video processing data to render the second video stream into a second video texture within a second AR effect in the AR scene.
Therefore, client device 2 can likewise present the video call as an AR scene, with the participants of the video call depicted as video textures within the AR scene.
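The rendering steps 422-428 can be condensed into a small sketch: each device combines both video streams with their respective processing data to build the same shared scene. The function and key names are illustrative, not taken from the patent.

```python
def render_ar_scene(stream_1, proc_1, stream_2, proc_2):
    """Toy version of steps 422-428: bind each video stream and its
    processing data to a video texture inside an AR effect."""
    return {
        "first_texture": {"source": stream_1, "tracking": proc_1,
                          "effect": "first_ar_effect"},
        "second_texture": {"source": stream_2, "tracking": proc_2,
                           "effect": "second_ar_effect"},
    }


# Each device feeds in its own captured stream plus the stream and
# processing data received from the other device, so both devices
# arrive at the same shared scene.
scene_on_device_1 = render_ar_scene("stream_1", "proc_1", "stream_2", "proc_2")
scene_on_device_2 = render_ar_scene("stream_1", "proc_1", "stream_2", "proc_2")
```

Because both devices derive the scene from the same inputs, the AR scene stays consistent across the call without either side shipping its rendered output to the other.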
As described above, the shared AR scene video call system 106 can enable client devices to utilize various types of video processing data to render video textures of participants within the AR scene during the video call.
Figure 5 illustrates a client device receiving and utilizing video processing data from another participant device to render a video texture of a participant in an AR scene. Specifically, Figure 5 shows a client device utilizing an AR engine that combines video data and video processing data to present video textures of participants within the AR scene in a self-view display during the video call.
The shared AR scene video call system 106 utilizes the video call stream channel 504 to establish communication between the video call participant device 502 and the client device 514 during the video call.
As shown in Figure 5, video call participant device 502 transmits video data and video processing data through the video call stream channel 504. More specifically, the participant device 502 transmits the video data and video processing data as combined data.
Video processing data may include participant metadata. The shared AR scene video call system 106 enables client devices to utilize participant metadata to identify participants in order to include them as video textures in the AR scene, determine the type of AR effect to apply to each participant, and/or present other information corresponding to the participant. For example, participant metadata may include a participant identifier, participant status, and/or video call information.
In one embodiment, the participant metadata includes an AR identifier that indicates the type of AR effect or a specific AR effect to be used for the participant. In one embodiment, the video processing data may include face tracking data. For example, the shared AR scene video call system 106 may enable client devices to recognize, track, and indicate facial tracking information from captured videos.
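A minimal sketch of the participant metadata described above might pair an identifier and status with an optional AR identifier that selects the effect applied to that participant. The field and function names here are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParticipantMetadata:
    """Illustrative shape for the patent's participant metadata:
    identification, status, and an AR identifier indicating which
    AR effect to use for this participant."""
    participant_id: str
    status: str
    ar_identifier: Optional[str] = None


def select_ar_effect(meta, default_effect="generic_effect"):
    """Pick the participant's AR effect, falling back to a default
    when no AR identifier was transmitted."""
    return meta.ar_identifier or default_effect


alice = ParticipantMetadata("alice", "active", ar_identifier="space_helmet")
bob = ParticipantMetadata("bob", "active")
```

The receiving client would consult this metadata before compositing, so each participant's texture lands inside the effect the metadata names.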
The Meta patent application, titled "Generating shared augmented reality scenes utilizing video textures from video streams of video call participants," was originally filed in May 2022 and was recently published by the United States Patent and Trademark Office.