A First Look at Intelligent Graphics at Peking University: A Concerto of Shape and Force, an Integration of Knowledge and Data


Author 丨 Mu

Editor 丨 Cenfeng

The metaverse is regarded as a natural next stage in the evolution of the Internet. Ever since human society invented language, writing, mathematics, and images, the information explosion has forced us to keep abstracting data into higher-dimensional forms, and the metaverse is another such revolution. There is a saying that “culture is the metaverse.” The world of the metaverse stems from reality, differs from reality, and transcends reality. In it we can talk face to face regardless of physical distance and deepen the meaning of reality, going beyond its rules. But before we can go beyond the rules, we are still immature at the very first step.

Today, countless scholars are exploring that first step toward the metaverse: restoring reality. In the field of vision, they study how to capture the 3D shapes of a city's high-rise buildings, how to simulate the contact between a cherry and water, and how to make virtual humans learn to walk and dance.

By studying the geometry and behavior of 3D objects, this field, computer graphics, is showing its unlimited potential to restore the world.

In our conversations with Chen Baoquan, Wang Bin, and Liu Libin, we could feel that although the metaverse is still out of reach, “the seeds have already germinated.” Chen Baoquan mainly studies geometry, that is, 3D modeling; Wang Bin and Liu Libin mainly study behavior, that is, physical simulation and motion control.

Geometry and behavior are the research directions pursued by the intelligent graphics team at Peking University. Together they form a duet of “shape” and “force.”

1 3D modeling

As Richard Feynman put it: if you cannot create it, you do not understand it.


“Graphics is also a necessary stage in how people explore and understand this world. At the same time, rebuilding the world is a concept that graphics has long advocated. Over many years, graphics has accumulated a great deal of knowledge about the world, for example the geometric representation of objects, physical properties, lighting, and so on. Graphics is a very important step toward visual intelligence,” Chen Baoquan said.

Chen Baoquan is a Boya Distinguished Professor at Peking University. His research fields are computer graphics, 3D vision, and visualization. He was elected a Fellow of the China Computer Federation in 2017 and an IEEE Fellow in 2020, was inducted into the IEEE Visualization Academy in 2021, and was elected a Fellow of the China Society of Image and Graphics.

Restoring reality on a computer even makes it possible to turn back time. At the 2022 Winter Olympics, Chen Baoquan showed us this magic: when watching an ice hockey game on a mobile phone, viewers could pause the game at any moment and rotate the rink through 360 degrees to savor the highlights.


This technology is only a small demonstration of Chen Baoquan's research on 3D modeling. He has focused on 3D modeling of real scenes since 2000. In 2009, in a project to build a 3D city model for Shenzhen, his team used laser scanning and other means to acquire 3D point clouds of real scenes and then reconstructed them. This technology has become a foundation for building smart cities.

In 2008, Chen Baoquan founded an international forum series on urban modeling and visualization; the first session was attended by participants from China and abroad.

In 2009, Chen Baoquan set up a team for large-scale 3D reconstruction of urban scenes based on vehicle-mounted mobile laser scanning.

Because of constraints in the outdoor environment, such as occlusion by trees, it is impossible to acquire point cloud data for every side of a building. The Chen Baoquan team therefore proposed a method that incorporates prior knowledge. Planar regions are identified in the sparse point cloud by clustering, and the intersection lines and intersection points between the planes are then computed to obtain complete polygons. The figure below shows a sparse 3D point cloud, the point cloud after clustering, and the reconstructed 3D model.

3D reconstruction from a sparse point cloud. Image source: Large-Scale Urban Scene Modeling and Understanding
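The geometric core of the step described above, turning fitted planes into polygon edges and corners, can be sketched in a few lines. This is a minimal illustration and not the team's implementation: it assumes each planar cluster has already been fitted as a plane `n · x = d`, and the function names (`plane_plane_line`, `three_plane_corner`) are hypothetical.

```python
# Minimal sketch (assumed plane representation: normal n and offset d with n . x = d).

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_plane_line(n1, d1, n2, d2, eps=1e-12):
    """Return (point, direction) of the line where two planes meet, or None if parallel."""
    u = cross(n1, n2)                      # direction of the intersection line
    uu = dot(u, u)
    if uu < eps:
        return None                        # (nearly) parallel planes have no single line
    w = tuple(d1 * n2[i] - d2 * n1[i] for i in range(3))
    p = cross(w, u)                        # a point satisfying both plane equations
    return tuple(c / uu for c in p), u

def three_plane_corner(n1, d1, n2, d2, n3, d3, eps=1e-12):
    """Return the single point shared by three planes, or None if they are degenerate."""
    det = dot(n1, cross(n2, n3))
    if abs(det) < eps:
        return None
    a, b, c = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    return tuple((d1 * a[i] + d2 * b[i] + d3 * c[i]) / det for i in range(3))

# Two walls (x = 1, y = 2) and a roof plane (z = 3) of an axis-aligned box.
edge = plane_plane_line((1, 0, 0), 1, (0, 1, 0), 2)
corner = three_plane_corner((1, 0, 0), 1, (0, 1, 0), 2, (0, 0, 1), 3)
```

Edges and corners computed this way can close a facade polygon even where the scan has no points, which is why the planar prior helps with occluded sides.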

Drawing on the respective strengths of 2D images and 3D point clouds, the Chen Baoquan team proposed, in the paper “2D-3D Fusion for Layer Decomposition of Urban Facades”, a method that fuses 2D images with 3D point clouds for layered reconstruction of building facades. By assigning the depth information of the 3D point cloud to the 2D image, a high-resolution, noise-free building model is recovered. The figure below shows the 3D point cloud and 2D image, the registered point cloud and image, the reconstructed 3D building model, and the textured model.

3D building reconstruction by fusing point clouds and images. Image source: Large-Scale Urban Scene Modeling and Understanding

Buildings and plants are the two most common types of entities in cities, and their 3D models make up most of a city's 3D scene. Unlike regular man-made buildings, plants are products of nature, and their 3D structure is more complex. Although rule-based modeling can also be applied to plants, it is hard for it to describe a specific given model or a real tree. From actually captured data (generally images and point clouds), a low-level model description such as a triangle mesh can be obtained.

In the paper “Automatic Reconstruction of Tree Skeletal Structures from Point Clouds”, the Chen Baoquan team proposed a method that automatically reconstructs the skeletal structure of trees from laser point clouds. The algorithm can reconstruct the branch structures of overlapping trees without first segmenting the point cloud.

Automatic tree skeleton reconstruction from laser point clouds. Image source: Large-Scale Urban Scene Modeling and Understanding

Having observed that local structures within the same tree resemble one another, the team proposed, in the paper “Texture-Lobes for Tree Modelling”, a lobe-based method that addresses the efficiency limitations of the earlier approach.


Lobe-based 3D tree modeling. Image source: Large-Scale Urban Scene Modeling and Understanding

Over the past ten years, as smart cities have developed rapidly, scenes have grown larger, the required granularity has grown finer, and updates have become more frequent. These have become the new requirements for 3D modeling of smart cities.

When the originally captured data is sparse or even missing, modeling methods based on prior knowledge and intrinsic geometric regularities reach their limits. The Chen Baoquan team proposed an “active” scanning mechanism that closes the loop between acquisition and reconstruction, so that reconstruction is guaranteed the data it needs. Active acquisition can be carried out by robots or by people.

To this end, the Chen Baoquan team put forward the idea of progressively constructing urban scenes. The idea holds that urban scenes are large and change rapidly, so centralized reconstruction is expensive and makes completeness and real-time updating impossible; the scene should instead be built up progressively so that it can be updated almost instantly. Intelligent agents (a single robot, multiple robots, or a crowd of people) are able to explore actively and are the main carriers of progressive scene construction.

In the paper “AutoScanning for Coupled Scene Reconstruction and Proactive Object Analysis”, the Chen Baoquan team proposed a single-robot active exploration method guided by scene confidence. The robot interacts with low-confidence parts of the scene to verify and improve the accuracy of the results, yielding refined reconstructions of indoor scenes.

Outdoor urban scenes are open environments that cannot be modeled in advance, and applying the same method directly leads to efficiency problems. “For a changing scene, how a robot navigates by itself and explores the scene is also a hard problem. After all, it involves not only the robot's movement but also the interaction between the robot and its environment,” Chen Baoquan said.

To this end, in the paper “Autonomous Reconstruction of Unknown Indoor Scenes Guided by Time-Varying Tensor Fields”, the Chen Baoquan team proposed a strategy for automatically reconstructing unknown indoor scenes driven by a time-varying tensor field. The field is constrained by the objects in the scene and updated as they are discovered, and it guides the robot's path as it explores, balancing efficiency and accuracy.

The efficiency of a single robot is always limited, so collaborative exploration by multiple robots is a natural choice. “The difficulty of robot collaboration is whether N robots can achieve N times the efficiency. We even hope to achieve an effect where 1 + 1 > 2. That is the key to collaboration,” Chen Baoquan said.


In the paper “Multi-Robot Collaborative Dense Scene Reconstruction”, the Chen Baoquan team proposed multi-robot collaborative exploration and active, progressive reconstruction of scene models based on optimal mass transport theory. The goal of optimal mass transport is to find a mapping between two distributions (or sets) that minimizes the total transport cost under a given cost measure.


In the multi-robot scanning and reconstruction problem, the robots are treated as the “suppliers” of scanning tasks, the unknown environment is treated as the “demand side”, and the cost a robot actually incurs in carrying out a scanning task (such as its travel distance) serves as the cost measure of the mapping. Solving the optimal mass transport problem on this basis yields the task assignment with the lowest scanning cost.

A multi-robot collaborative dense reconstruction algorithm for unknown indoor scenes. Image source: Multi-Robot Collaborative Dense Scene Reconstruction
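The supplier and demand formulation above can be illustrated with its simplest discrete special case: one task per robot, with travel distance as the cost. The sketch below is an assumption-laden toy, not the paper's solver. It enumerates assignments by brute force, which only works for a handful of robots, and the function name `assign_robots` and the coordinates are hypothetical.

```python
import math
from itertools import permutations

def assign_robots(robots, targets):
    """Brute-force the one-task-per-robot assignment with the lowest total travel distance.

    robots, targets: lists of (x, y) positions; len(targets) >= len(robots).
    Returns (assignment, total_cost), where assignment[i] is the target index of robot i.
    """
    best_cost, best = math.inf, None
    for perm in permutations(range(len(targets)), len(robots)):
        cost = sum(math.dist(robots[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return best, best_cost

# Two robots and two unexplored regions: each robot should take the nearby region.
assignment, total = assign_robots([(0.0, 0.0), (10.0, 0.0)], [(9.0, 0.0), (1.0, 0.0)])
```

A real system would replace the brute-force loop with a proper transport or assignment solver and would let one robot serve several regions, but the objective, minimum total cost under a chosen distance measure, is the same.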

“On the whole, we need global planning to coordinate cooperation and task allocation among all the robots, and we must also plan the tasks each robot can complete on its own, based on its local view.”


The world is not a collection of static knowledge, and Chen Baoquan keeps embracing progress on his research journey. Combining prior knowledge with learning from data, he has witnessed geometric modeling keep extending in both scale and fineness. But a world confined to the geometric model itself is still static.

“From generating a world to understanding a world, the two are inseparable. Generation serves understanding, and understanding in turn serves better generation.” The aim is not mere partitioning of geometry, but restoring the realism, and even the dynamic reactions, of contact and collision with other objects in the real world.

“Geometric modeling is the basis of physical simulation. Usually we must first obtain an object's geometric parameters and then infer its physical parameters from the dynamic changes in its geometry, as in Wang Bin's research on lotus leaves. In the same way, to control a person's posture, you first need real human data to learn from. But for natural phenomena, geometric modeling and physical simulation sometimes have to be carried out together, obtaining a dynamic reconstruction of the phenomenon through global optimization,” Chen Baoquan said.

2 Physical simulation

“If we shake a lotus leaf with an external force, we obtain dynamic data for the leaf. From this we can infer not only the leaf's geometric shape but also its physical parameters,” Wang Bin said. “These physical parameters include not only the stiffness of the material but also its damping characteristics, rest shape, and so on.”

Wang Bin is currently a full-time researcher at the Beijing Institute for General Artificial Intelligence (BIGAI). Before joining BIGAI, she was a researcher at the Advanced Innovation Center for Future Visual Entertainment at Beijing Film Academy from 2017 to 2021.

Dr. Wang Bin graduated from Beijing University of Aeronautics and Astronautics. During her doctoral studies, her research direction was virtual reality and human-computer interaction, a very cutting-edge direction at the time. She then went to UBC as a visiting researcher, working mainly on modeling and simulation of the hand.

During her visit, Wang Bin gradually became interested in physical simulation. Because physical simulation has a high barrier to entry, she started from collision detection, gradually moved into simulation, and has worked deeply in the field ever since.

Wang Bin told us that studying physical simulation requires a deep grounding in mathematics and physics, as well as strong coding ability. “At the foundational level, we often need to build our own wheels from scratch. In addition, physical simulation is computationally heavy, so the algorithms need good structural design and efficient implementation. To improve computational efficiency, some of the computation also has to be moved to the GPU, which places still higher demands on programming ability.”

On the mathematical side, physical simulation relies mainly on numerical computation and optimization theory. “For example, inverse analysis algorithms need a foundation in optimization algorithms, while the simulation itself involves numerical computation,” Wang Bin said.

Later, Wang Bin went to the National University of Singapore for postdoctoral research in physical simulation. She then spent five years at the Advanced Innovation Center for Future Visual Entertainment at Beijing Film Academy and recently joined the Beijing Institute for General Artificial Intelligence. She has collaborated with Peking University and universities abroad on many physical simulation studies, including fluid simulation and magnetic substance simulation.

Inferring the physical parameters of the lotus leaf is a study in inverse material modeling. The results were published in the paper “Deformation Capture and Modeling of Soft Objects”, completed by Wang Bin, Liu Libin, and others.

The system captures and reconstructs a dynamic model of a soft object from motion data alone. The model can then synthesize new motions that satisfy specified constraints and respond to dynamic perturbations. Top left: a walking dinosaur; top middle: a jumping pot rack; top right: a jumping hanger. Bottom: a lotus leaf swaying in an artificial wind field. Image source: Deformation Capture and Modeling of Soft Objects

What drives interaction in graphics falls into two branches: geometric data and mechanics. Geometry-driven approaches capture dense geometric shapes of a phenomenon and then interpolate among them to obtain results, whereas the lotus leaf work is mechanics-driven.

“The overall interaction is driven by a physical model, and the key parameters of the model are solved in a data-driven way, for example the stiffness of the object, its damping coefficients, and its reference shape (the natural rest state in the absence of gravity). It is a modeling method that infers the system dynamics and physical coefficients inversely from motion data,” Wang Bin said.
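To make the idea of inverting physical coefficients from motion data concrete, here is a deliberately tiny sketch. It is not the method of the paper: it assumes a one-dimensional damped spring `m*x'' + c*x' + k*x = 0` in place of a deformable leaf, generates synthetic motion from known values, and recovers stiffness `k` and damping `c` by least squares on finite-difference velocities and accelerations. All names and numbers are hypothetical.

```python
import math

def synthetic_motion(k, c, m=1.0, dt=1e-3, steps=5000):
    """Sample x(t) = exp(-g t) cos(w t), an exact solution of m x'' + c x' + k x = 0."""
    g = c / (2.0 * m)
    w = math.sqrt(k / m - g * g)          # damped angular frequency (underdamped case)
    return [math.exp(-g * i * dt) * math.cos(w * i * dt) for i in range(steps)]

def estimate_stiffness_damping(x, dt, m=1.0):
    """Least-squares fit of k and c so that m*a + c*v + k*x is as close to 0 as possible."""
    sxx = sxv = svv = bx = bv = 0.0
    for i in range(1, len(x) - 1):
        v = (x[i + 1] - x[i - 1]) / (2.0 * dt)            # central-difference velocity
        a = (x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt ** 2  # central-difference acceleration
        sxx += x[i] * x[i]
        sxv += x[i] * v
        svv += v * v
        bx += -m * a * x[i]
        bv += -m * a * v
    det = sxx * svv - sxv * sxv           # 2x2 normal equations solved by Cramer's rule
    k = (bx * svv - sxv * bv) / det
    c = (sxx * bv - sxv * bx) / det
    return k, c

dt = 1e-3
motion = synthetic_motion(k=40.0, c=0.8, dt=dt)
k_est, c_est = estimate_stiffness_damping(motion, dt)
```

Real captured data is noisy and the object has many coupled degrees of freedom, so practical systems optimize over full simulations rather than differentiating the data directly; the sketch only shows why motion alone can pin down material coefficients.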

Once the material coefficients have been recovered, they can be modified, customized, and transferred to other similar objects. Data-driven inverse modeling can also be used to fit “super materials” that do not exist in reality. “The purpose of inverse material modeling is to reduce the gap between simulation and reality. When we need to control certain parameters of the model and give it new characteristics, the model can also be adjusted by intervening in the parameters.”


AI methods are generally not used to express the design of the material model and its coefficients. “That is because they usually cannot satisfy the many priors. Intuitively, many hard constraints cannot be satisfied, and interpretability is poor. Deep learning is highly coupled, and at present it is impossible or difficult to explain or control what each parameter does.” Inverse material modeling of a deformable object requires a good combination of data-driven methods and prior knowledge.

Inverse material modeling is usually limited to a single object. Data is not collected for scenes in which multiple objects interact, because many quantities, such as contact forces, cannot be measured or captured. Wang Bin is nevertheless working in this direction.

In the paper “Solid-Fluid Interaction with Surface-Tension-Dominant Contact”, Wang Bin, Chen Baoquan, and others studied the simulation of solid-fluid coupling under strong surface tension. In this study, a steel paperclip, a cherry, autumn leaves, or a water strider robot can all float on the water surface, supported by surface tension, and stir up realistic, natural ripples.

The three-way coupling method can simulate contact dynamics dominated by surface tension between solids and liquids, including a steel paperclip resting on water, a cherry on water, autumn leaves floating and rotating in a stream, and a water strider robot driven by its joints. Image source: Solid-Fluid Interaction with Surface-Tension-Dominant Contact

The defining feature of this kind of solid-liquid contact is strong surface tension. For example, a steel paperclip is about 8 times as dense as water, yet it can still float on the surface because water has a relatively high surface tension coefficient.

For a solid object resting on water, force balance can be understood as a balance among gravity m_r g, buoyancy F_b, and the capillary force F_a: m_r g = F_a + F_b. The buoyancy is obtained by integrating over the displaced volume of water, and the capillary force is computed by integrating the surface tension along the perimeter of that volume.

Solid-fluid interaction. A round solid floats on the water under a balance among gravity, buoyancy F_b, and the capillary force F_a. Image source: Solid-Fluid Interaction with Surface-Tension-Dominant Contact
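A back-of-the-envelope check shows why the balance m_r g = F_a + F_b lets a dense object float. The sketch below is a crude estimate under stated assumptions, not the paper's model: a long horizontal cylinder, a surface tension of 0.072 N/m for water, a solid 8 times as dense as water (the ratio quoted above), a capillary force capped at the surface tension acting vertically along both contact lines, and buoyancy capped at full submersion (the extra water displaced by the curved meniscus is ignored). The function names are hypothetical.

```python
import math

G = 9.81             # gravitational acceleration, m/s^2
GAMMA = 0.072        # assumed surface tension of water, N/m
RHO_WATER = 1000.0   # kg/m^3
RHO_SOLID = 8000.0   # assumed: 8 times the density of water

def weight_per_length(radius):
    """Weight of a solid cylinder per metre of length, N/m."""
    return RHO_SOLID * math.pi * radius ** 2 * G

def max_support_per_length(radius):
    """Upper bound on F_a + F_b per metre of length, N/m."""
    capillary = 2.0 * GAMMA                            # two contact lines, tension pulling straight up
    buoyancy = RHO_WATER * math.pi * radius ** 2 * G   # cylinder fully below the undisturbed level
    return capillary + buoyancy

def can_float(radius):
    return weight_per_length(radius) <= max_support_per_length(radius)

thin_wire = can_float(0.0005)   # 1 mm diameter wire
thick_rod = can_float(0.002)    # 4 mm diameter rod
```

Weight grows with the cross-sectional area while the capillary bound stays fixed per unit of contact line, so only sufficiently thin objects are held up; this is the regime in which the capillary term cannot be neglected in a simulation.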

Computationally, simulating the interplay of these three forces accurately depends on properly handling three subsystems: the liquid, the solid, and the strongly tensioned liquid interface between them.

However, in the computational physics and computer graphics communities, the problem of simulating strongly coupled surface tension has remained largely unexplored, for lack of effective computational tools that accurately simulate the interaction among the three subsystems.


In a traditional two-way coupling system, there is no direct channel bridging the liquid and the solid, which makes it impossible to simulate the vital F_a term in the fluid-solid system. “An Eulerian fluid grid usually cannot track the surface well. Surface tension depends on curvature, and curvature is not easy to compute accurately on an Eulerian grid.”

To this end, Wang Bin and the team proposed a novel “three-way” coupling mechanism to simulate solid-liquid coupling driven by strong surface tension. “The key is to treat the surface-tension-dominant interface as a thin Lagrangian membrane coupled simultaneously to the liquid volume and the solid objects. The interface is no longer a zero-thickness numerical carrier but has a small finite thickness. The Lagrangian method can accurately track the surface and compute the surface tension, it represents collisions between the surface and objects well, and it applies the molecular tension forces to the solid.”

The team developed a numerical infrastructure around this “three-way” coupling idea that handles incompressibility, buoyancy, surface tension, rigid articulation, and a variety of complex interactions. “An important feature of our numerical solver is that it can handle coupling between liquids and solids with high density ratios, which was not feasible for any previous method.”

Besides floating objects, the method can also simulate the “Cheerios effect” (for example, cereal floating on milk) and the weakening of surface tension caused by surfactants. “So, through numerical solving, we can achieve coupling across multiple scales and multiple physical fields. The basic idea is to start from the underlying physical mechanism and describe it within a numerical computing framework.”

A sphere falling into water. Because the liquid membrane is meshed, the fine waves excited by the solid's motion can propagate across it. Image source: Solid-Fluid Interaction with Surface-Tension-Dominant Contact

The lotus leaf and surface tension simulations above are both problems in classical mechanics. In the paper “A Level-Set Method for Magnetic Substance Simulation”, Wang Bin, Chen Baoquan, and others took on the simulation of magnetic fluids and offered a resolution to a debate that has run in the field for many years.


The debate is this: is the magnetic force on a material a body force or a surface force? Even today the question has no clear answer, and the origin of the debate can be traced back to the birth of Maxwell's equations 150 years ago.


Among surface tension phenomena, magnetic fluids display a distinctive surface geometry and dynamics: arrays of sharp cones emerge and evolve. These striking features arise from the interplay among gravity, surface tension, and magnetic force.

Wang Bin and the team proposed that the magnetic-mechanical coupling system can be solved as an interface problem, both theoretically and computationally. “Magnetic fluids are generally computed on a background grid. In our research we did not model the force as a body force; we built the model with a surface force instead.”

“Modeling with a surface force lets us cleverly use the jump across the boundary to represent that force, which is exactly what this mathematical method describes well, so the computation proceeds smoothly. In magnetic fluid modeling, therefore, a grid-based representation is all we need to describe it well.”

The forward coupling, from the magnetic field to the mechanical system, is modeled as an interfacial Helmholtz force acting on the moving object (fluid or solid). The backward coupling, from the mechanical system to the magnetic field, is obtained by moving the magnetic material (level set, particles, or mesh) immersed in the background magnetic field.

The computational framework integrates easily into a standard Eulerian fluid solver to simulate and visualize complex magnetic fields. Because the method is Eulerian, it naturally computes long-range magnetic interactions accurately, regardless of the distance between the immersed objects. The method covers the simulation of ferrofluids, rigid magnets, deformable magnets, and multiphase coupling.

The unified level-set method can simulate and visualize the dynamics of a variety of magnetic phenomena, including ferrofluids, deformable magnets, rigid magnets, and multi-physics interactions. Image source: A Level-Set Method for Magnetic Substance Simulation


As we often see in textbooks, many physics problems have tightly restricted objects and boundary conditions. The physical phenomena simulated in computer graphics, however, whether the solid-liquid coupling described above or magnetic fluids, often span very large ranges in space, time, and phase change, and they involve multiple phenomena and multiple boundary conditions.

“In other words, we need to handle large changes in scale and in boundary conditions within a single solve, which is very different from solving traditional problems in mathematical physics,” Chen Baoquan said. “Solving such complex phenomena means combining the methods of different systems while keeping the geometric representation continuous, which is hard to do. For example, in simulating solid-fluid coupling, the solid has its own representation, the fluid has its own representation, and energy is transferred between them. In other words, the solid has an equation, the fluid has an equation, and the two equations must be coupled.”

The challenge in simulating magnetic substances lies in simulating multiple physical fields. In essence, a magnetic field is added to a solid simulation, and the magnetic field and the solid interact with each other. The additional magnetic field makes the overall system more complex, so coupling classical mechanics with electrodynamics is the key. Similar challenges arise in coupled simulation of rigid and elastic bodies.

3 Motion control

The most important application of joint modeling of elastic and rigid bodies is simulation of the human body. Earlier human simulation work simplified the body into an articulated structure of rigid bodies and did not consider the effect of muscle and fat. In fact, these elastic tissues have a great influence on movement. “If our control algorithm does not capture the effect of muscle and fat on the bones, its realism drops significantly. So we must consider all the factors that affect movement,” Chen Baoquan said.

Many current games make little use of this kind of simulation. “The reason is that they do not need such accurate simulation. They care more about computational efficiency and visual effect.”

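Joint modeling of elastic and rigid bodies bears on the modeling of digital humans. The difficulty in modeling digital humans lies in describing them comprehensively, including reproducing texture and motion as well as medical and physiological structures (such as blood vessels, muscles, and nerves).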

In the paper “Learning Skeletal Articulations with Neural Blend Shapes”, Liu Libin, Chen Baoquan, and others proposed a new method that overcomes the common deformation artifacts of 3D digital models in motion, thereby achieving high-quality skin deformation.

Traditional skinning and rigging models oversimplify the movement of humans and animals, which leads to classic deformation artifacts, whereas blend shape techniques provide fine-grained control in sensitive areas such as joints. Building on this, the work proposes a new “neural blend shapes” technique, based on artificial neural networks, that automatically handles digital models with different shapes and connectivity.


A neural network takes as input a character's skeleton and a skin mesh with arbitrary connectivity and generates neural blend shapes. The framework produces pose-dependent displacements that lead to high-quality deformation, especially around the joints. Image source: Learning Skeletal Articulations with Neural Blend Shapes

During training, the network observes deformed shapes and learns, through indirect supervision, to infer the corresponding rig, skinning, and blend shapes, which removes the need to supply supervised enveloping or blend shape deformation parameters. Because the training data comes with a specific underlying deformation model, indirect supervision allows any number of blend shapes to be learned.

Envelope deformation branch. Given the mesh in T-pose (V, F) and the joint rotations (R), the neural network infers the skinning (W) and rigging (O) parameters by observing the vertex positions of articulated characters, under indirect supervision. Image source: Learning Skeletal Articulations with Neural Blend Shapes
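For readers unfamiliar with enveloping, the sketch below shows the classical linear blend skinning formula that skinning weights plug into, plus an optional rest-pose displacement standing in for a pose-dependent blend shape. It is a two-dimensional toy with hand-picked weights, not the network from the paper. Applying the displacement before skinning is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def rotate_about(p, pivot, angle):
    """Rotate 2D point p around pivot by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = p[0] - pivot[0], p[1] - pivot[1]
    return (pivot[0] + c * x - s * y, pivot[1] + s * x + c * y)

def linear_blend_skinning(vertex, weights, pivots, angles, displacement=(0.0, 0.0)):
    """Blend the per-bone rigid transforms of one vertex using its skinning weights.

    displacement is added in the rest pose before skinning (assumed convention);
    it plays the role of a pose-dependent corrective offset.
    """
    v = (vertex[0] + displacement[0], vertex[1] + displacement[1])
    out_x = out_y = 0.0
    for w, pivot, angle in zip(weights, pivots, angles):
        px, py = rotate_about(v, pivot, angle)
        out_x += w * px
        out_y += w * py
    return (out_x, out_y)

# Two-bone arm: bone 0 stays fixed, bone 1 bends 90 degrees around the elbow at (1, 0).
pivots = [(0.0, 0.0), (1.0, 0.0)]
angles = [0.0, math.pi / 2]
rigid = linear_blend_skinning((2.0, 0.0), (0.0, 1.0), pivots, angles)    # follows bone 1 only
blended = linear_blend_skinning((2.0, 0.0), (0.5, 0.5), pivots, angles)  # half and half
elbow_distance = math.hypot(blended[0] - 1.0, blended[1])
```

The blended vertex ends up closer to the elbow than its rest distance of 1, which is the kind of volume loss near joints that pose-dependent corrective displacements are meant to compensate.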

“This work is the first deep-learning-based automatic enveloping method combined with pose-dependent blend shapes, and it can be applied to skin meshes with arbitrary connectivity,” Liu Libin said. “It is worth noting that our model is very good at capturing detailed deformation of the human body (for example, muscle jiggle).”

Chen Baoquan said, “What we have achieved so far is one-way modeling: we reproduce the movement and then modify the shape of the muscle accordingly, rather than adjusting the motion control in response to changes in the muscle. As a result the jiggle differs, and simulation and reality are still not the same.”


“Human movement is the result of a subjective process, so we usually cannot constrain the process and its outcome with fixed rules. It is essentially a statistical model, and for such problems AI is a good solution. Much of the current frontier work also comes from breakthroughs in AI, in which deep learning and reinforcement learning play important roles,” Liu Libin added.

Liu Libin is an assistant professor at the Center on Frontiers of Computing Studies, Peking University. His main research areas are computer graphics, physical simulation, and motion control, together with related fields such as optimal control, machine learning, and reinforcement learning.

Before joining the center, Dr. Liu Libin did postdoctoral research at the University of British Columbia in Canada and at Disney Research, and later joined DeepMotion Inc., a Silicon Valley startup in the United States.

Liu Libin focuses on motion control. One of the most important applications of this technology is character animation. Producing traditional character animation involves modeling, rigging, camera control, and motion generation, and the whole process takes a great deal of time and labor. Combined with artificial intelligence, animation production can be expected to speed up. In fact, Liu Libin began exploring how animated characters learn skills during his PhD.


Unlike physical simulation, character animation does not have enough systematic knowledge to draw on, so Liu Libin and the team began to try reinforcement learning. Their studies found that, whether for a single skill or a combination of skills, reinforcement learning works better than traditional methods.


“I think a complete artificial intelligence should have good motor abilities, which support an agent in exploring and in completing more complicated tasks. So we hope that future artificial intelligence can actively perceive movement, learn new motor skills autonomously, and coordinate the use of those skills according to the situation, so as to interact and collaborate with people and with other artificial intelligences,” Liu Libin said.

Of course, even if muscle jiggle can be reproduced well, generating smooth movement with artificial intelligence still requires a large amount of motion data. From manually adjusting a character's pose at keyframes, to motion capture, to supervised pose estimation with deep learning, motion learning can go one step further: to unsupervised learning of motion.

In the paper “Unsupervised Co-part Segmentation through Assembly”, Liu Libin, Wang Bin, Chen Baoquan and others proposed an unsupervised, image-based method for co-part segmentation. The method can effectively segment human bodies, hands, quadrupeds, and robotic arms, and in turn capture the motion information in a video. Once that information is transferred to an animated character model, motion can be generated naturally.

Visual segmentation results in different scenarios, including humans, hands, quadrupeds, and robotic arms. Image source: Unsupervised Co-part Segmentation through Assembly


A video sequence contains all the structure and motion information of an action, including how the subject and its pose change over time.

The goal of Liu Libin and the team in this study is to extract a general part-based representation from video. Once the parts are obtained, they can be freely recombined.


Specifically, during training, the image encoder converts the source image into a source latent feature map and a set of source part transformations. The source part transformations map the source latent features into a canonical feature map, the “origin” of the figure. A second, target image is also given as input and converted into a target latent feature map and target part transformations. The canonical feature map is then transformed by the target part transformations into a re-posed feature map. The network is judged by how well the re-posed feature map decodes back into the target image, and how well the source latent features decode back into the source image.

The segmentation network is trained end to end. Image source: Unsupervised Co-part Segmentation through Assembly


The final image is generated not by warping the image globally but by blending the warped images of the individual parts. In essence, this image-based assembly operation effectively constrains how each separate part is formed, which improves the final result.
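The blending step can be illustrated with a small sketch. It is not the paper's implementation: it assumes each part already comes with a warped image and a per-pixel score map, and that the scores are turned into blending weights with a softmax. The function name `assemble` and the nested-list image format are made up for illustration.

```python
import math

def assemble(part_images, part_logits):
    """Blend K warped part images into one image.

    part_images[k][y][x] is the intensity of part k at pixel (y, x).
    part_logits[k][y][x] is an (assumed) score of how strongly part k
    claims that pixel; a per-pixel softmax turns scores into weights.
    """
    num_parts = len(part_images)
    height = len(part_images[0])
    width = len(part_images[0][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            logits = [part_logits[k][y][x] for k in range(num_parts)]
            top = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - top) for v in logits]
            total = sum(exps)
            out[y][x] = sum(
                (e / total) * part_images[k][y][x] for k, e in enumerate(exps)
            )
    return out

# Two 1x2 "images": part 0 is black, part 1 is white.
images = [[[0.0, 0.0]], [[1.0, 1.0]]]
# Pixel 0 is contested equally; pixel 1 is claimed by part 0.
logits = [[[0.0, 50.0]], [[0.0, -50.0]]]
blended = assemble(images, logits)
```

Because every output pixel is a weighted mix of the parts, a reconstruction loss on the blended image pushes each part to explain only the region it actually covers, which is the constraint the paragraph above describes.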

Compared with segmenting each image in isolation, the self-supervised approach aggregates shape-related information from multiple images, which improves the segmentation of any single image.


In films and similar settings, camera work is also an important part of the narrative. One way to generate camera trajectories is to rely on prior knowledge from cinematography, but this prior knowledge is hard to express in mathematical language. To this end, in the paper “Example-driven Virtual Cinematography by Learning Camera Behaviors”, Wang Bin, Chen Baoquan and others proposed extracting the camera style from an input video so that shots of a virtual animated scene show a similar style.


A camera motion controller that automatically extracts camera behaviors from different film clips (left) and re-applies them to a 3D animation (middle). In this example, the model automatically generates three different camera trajectories (red, blue, and yellow curves) from three different reference clips. On the right, four specific moments along each camera trajectory show the ability to encode and reproduce camera behaviors from different input examples. Image source: Example-driven Virtual Cinematography by Learning Camera Behaviors


Wang Bin said that artificial intelligence plays a large part in this work because the problem differs from physical simulation. “There is rich and solid formal knowledge behind physical simulation, and there is no need for AI to reinvent the wheel. Camera language, by contrast, is strongly semantic, and there is no suitable mathematical model to describe it. This is where neural networks have the advantage: they are better suited to modeling and describing strongly semantic things.”

“In motion generation, there are currently no such semantic features,” Liu Libin added. “Similar work exists on motion style, for example semantic latent variables that represent cheerful or sad emotions, but there are no comparable results yet for motion control. I think this is a future direction, though, because motion control is an organic combination of multiple actions, and an abstract, semantic representation of it may be promising. There are already some early signs and preliminary work suggesting that this makes sense.”

Asked why he chose to specialize in motion control, Liu Libin said, “In this direction, academic exploration is still out in front. At present, the quality of what it generates cannot meet the needs of industry. Although it can provide basic control capabilities, its efficiency is still far from what industry actually needs. There is a lot of room for research in this direction.”

The current work does not model the environment, but in the future motion control may need to interact with the physical environment. “We will consider adding a step for physical modeling of the environment to make it more realistic.”


“In the field of motion control, people are currently focusing on multi-skill learning. For example, adversarial skills (fighting) and cooperative skills (dancing) both involve combining multiple skills.” Multi-skill learning is useful not only for entertainment but also in fields such as intelligent driving and service robots.

Liu Libin believes that skill transfer will be a potential research hotspot. For example, once some control experience has been acquired, how can that existing knowledge be used to learn better? After a robot has learned to balance, how can it reuse the balancing skill when it later learns a backflip? A backflip also involves staying balanced. “This is a bit like the pre-trained models of NLP. We can do similar research for motion control, which could be called a ‘digital brain’.”

“It can be said that we are currently studying and developing the ‘cerebellum’ of artificial intelligence. The ‘cerebrum’ part is more about language and vision. In the future these two will probably merge further and produce more brilliant sparks. The cerebellum part is still developing, especially the learning and expansion of multi-skill collections. I believe that one day we can achieve a complete digital brain.”

4 Challenges

Although computer graphics has made its way into everyday life through its applications, fundamental challenges remain.

“At the level of basic theory, geometry still has a goal that is far from achieved: a continuous, efficient, and unified geometric representation of things that change over time. For example, as an object changes, how to represent its geometry throughout the process while also accounting for its attributes and dynamics is a big problem. When it comes to specific physics and dynamics, the representation of each attribute differs, which in the end affects the output. There are also many challenges in engineering systems. The engineering systems of computer graphics involve sensors, sensor communication, computing, storage, and more, and development in this area needs to be pushed forward. The GPU is an example,” Chen Baoquan said.

In physical simulation, multi-physics scenes and multi-scale simulation still pose many challenges, and non-differentiable phenomena such as phase change, collision, flipping, and deformation also bring fundamental difficulties to neural network applications that rely on gradient-based learning.


“I don’t agree with using deep learning to completely replace physics-based models, because physicists have studied these scenarios for a long time and given approximate theoretical models. A neural network cannot summarize and express such scenarios at a comparable level the way humans do. In other words, data-driven neural network models usually cannot learn the underlying logic of the physical world, nor can they guarantee the controllability of physical simulation features,” Wang Bin said.

For these challenges, one current research direction is to describe and model them with a unified simulation method. “For example, the MPM method is suitable for simulating both fluids and rigid bodies and is well recognized in the field. The IPC method replaces the constraint form of collisions with an energy form. With this unified formulation, the simulation can be solved stably and simply, every step is guaranteed to be free of penetration, and the operation is differentiable.”
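The idea of replacing a collision constraint with an energy can be sketched with the clamped log-barrier used in the IPC literature, b(d) = -(d - d_hat)^2 ln(d / d_hat) for 0 < d < d_hat and 0 otherwise. The sketch below treats a single contact pair at distance `d`; the function names and the scalar setting are illustrative assumptions, not the authors' code.

```python
import math

def ipc_barrier(d, d_hat):
    """Barrier energy for one contact pair.

    d     -- current unsigned distance between the two primitives
    d_hat -- activation distance; pairs farther apart add no energy
    """
    if d <= 0.0:
        raise ValueError("d must stay positive; d <= 0 means penetration")
    if d >= d_hat:
        return 0.0
    return -((d - d_hat) ** 2) * math.log(d / d_hat)

def ipc_barrier_grad(d, d_hat):
    """Derivative of the barrier with respect to d.

    It is negative inside the active range, so the energy falls as the
    pair separates, which acts as a repulsive force.
    """
    if d <= 0.0:
        raise ValueError("d must stay positive; d <= 0 means penetration")
    if d >= d_hat:
        return 0.0
    return -2.0 * (d - d_hat) * math.log(d / d_hat) - (d - d_hat) ** 2 / d

d_hat = 0.01
far = ipc_barrier(0.02, d_hat)      # outside the active range
near = ipc_barrier(0.005, d_hat)    # inside the active range
closer = ipc_barrier(0.001, d_hat)  # much closer to contact
```

The energy is zero beyond `d_hat`, grows without bound as `d` approaches zero, and joins the zero region smoothly. That is why a minimizer of the total energy never reaches a penetrating state while the whole problem remains differentiable.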

Artificial intelligence is not yet used at scale in physical simulation, but Wang Bin also pointed out that many hard problems in physical simulation systems may be solved with AI in the future. “It should act as a tool for the difficult steps and problems in existing systems. I believe that combining traditional physical modeling with artificial intelligence methods will gradually become the mainstream.”

Wang Bin believes deep learning may be a good complement where response time matters, because it can usually find a fast substitute solution. “This solution can express the content we want, and it is relatively fast.”

“For example, during simulation we usually need to solve some large linear systems, but the condition number of the matrix is usually poor. In that case we need other techniques, such as preconditioning, to get a more reliable solution. This is a difficult and time-consuming step, and it is strongly tied to the physical problem. Here AI may also help us solve the equations quickly by finding a suitable preconditioner for the ill-conditioned system.”
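What a preconditioner buys can be seen in a small sketch. It uses a conjugate gradient solver with a plain Jacobi (diagonal) preconditioner on a made-up, badly scaled test matrix. Nothing here is learned, and the matrix, tolerance, and function names are assumptions for illustration. A learned preconditioner of the kind Wang Bin describes would take the place of the `precond` callable.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, precond=None, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite A.

    precond(r) should return a cheap approximation of A^{-1} r;
    None means no preconditioning. Returns (solution, iterations).
    """
    if precond is None:
        precond = list  # identity: just copy the residual
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x with x = 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def build_system(n=40):
    """Tridiagonal SPD matrix whose diagonal spans four orders of magnitude."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 10.0 ** (4.0 * i / (n - 1))
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = 0.1
    return A, [1.0] * n

A, b = build_system()
diag = [A[i][i] for i in range(len(b))]

def jacobi(r):
    return [ri / di for ri, di in zip(r, diag)]

x_plain, it_plain = pcg(A, b)
x_pc, it_pc = pcg(A, b, precond=jacobi)
```

Comparing `it_pc` with `it_plain` shows the effect: rescaling by the diagonal brings the spectrum of this system close to one, so the preconditioned solve needs far fewer iterations. Real simulation matrices rarely yield to such a simple rescaling, which is where a problem-specific or learned preconditioner would come in.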


Motion control is basically experience-based learning, so it faces the same problems as deep learning. It requires a lot of computation, so efficiency has to be considered. The main problem in applications is generation quality: many tasks still cannot meet the needs of industrial use.

5 Computer graphics and artificial intelligence

Unlike computer vision, which has almost completely embraced deep learning, computer graphics still values the role of prior knowledge. As the two continue to interact, unpredictable new developments may follow.

How does computer graphics advance artificial intelligence? Chen Baoquan said its contribution can be divided into two levels.


The first level is providing the task environment in which artificial intelligence is trained and tested. “First of all, it provides training data. We can obtain a lot of data through simulation. Some data is usually very expensive to acquire, and data collected in the real world may not meet training needs. In that case, graphics can provide a virtual test environment. In general, we can build a simulation environment and let the agent train, test, and receive feedback in it. Such models have been widely used in autonomous driving.”

The second level is providing a representation model of the objects that the AI algorithm itself works on. For example, model-based reinforcement learning can learn directly from the modeling parameters of computer graphics as its input, which reduces the amount of data needed for learning. “This is equivalent to helping AI reduce the complexity of the environment. That is, computer graphics helps AI compress environmental information and extract the most important factors. At the same time, the model obtained through this learning process is more knowledge-based and more explainable. In addition, the virtual environment provided by computer graphics is more controllable: knowledge and difficulty can be controlled, and unnecessary accidents avoided.”

Computer graphics methods generally model with constraints and explicit models, while AI is generally data-driven, and what the two can achieve differs. “If you want to model more complicated objects, you need to decompose the problem: which parts need CG, which need AI, and which need a combination.”

Generally speaking, in the initial stage we tend to decompose the problem with knowledge from computer graphics. At the nodes of the problem tree, or in the last mile, the problem becomes hard to model explicitly, and at that point AI methods need to be brought in. For example, when we know that the object being modeled is a tree, we build a base model from this prior knowledge so that it has the basic characteristics of trees, and then estimate the parameters of this particular tree from data.

Similarly, physical modeling has a complete knowledge system, while AI is still largely a black box that relies on learning from data. What is the relationship between knowledge and data? Wang Bin said, “Knowledge is a model summarized from data, while the advantage of data is that its gap with the real world is smaller, so it carries more information. Knowledge is more macroscopic. That is the biggest difference between the two.”

6 The dual play of shape and force

Closely related concepts in science cannot be completely independent. Just as the mechanical parameters of a lotus can be inferred from its geometric data, the shape of a magnetic fluid can be inferred from the distribution of the magnetic field. The shape and force behind geometric modeling, physical simulation, and motion control derive from each other and cannot be separated. In the theory of relativity, uniform motion rotates the light cone, which produces time dilation and length contraction, and mass distorts the light cone, which produces free fall. Time and space cannot be separated, and neither can mass and spacetime.


Although shape and force do not cover all of computer graphics, and are not the whole foundation on which the Yuan universe will be built, the two must work together in the future and advance alongside artificial intelligence. Together they bring into the Yuan universe one of the most important experiences of reality: touching the world.



Peking University Frontier Computing Research Center Visual Calculation Day

Zhiyuan Star Liu Libin: Let AI approach human sports ability infinitely

Peking University Liu Libin: The Secret of the Supporting Technology of the “Yuan Cosmic” —


Tsinghua University: Research Report of the Development of Yuan Cosmic Universe

From digital cities to digital twin cities

Autoscanning for Coupled Scene Reconstruction and Proactive Object Analysis


Multi-Robot Collaborative Dense Scene Reconstruction

Deformation Capture and Modeling of Soft Objects


Solid-Fluid Interaction with Surface-Tension-Dominant Contact


A Level-Set Method for Magnetic Substance Simulation

Example-driven Virtual Cinematography by Learning Camera Behaviors

Learning Skeletal Articulations with Neural Blend Shapes


Unsupervised Co-part Segmentation through Assembly