First stop: Unity in S.F., then on to Microsoft’s Mixed Reality Capture Studio
This past fall I had the good fortune to be in the right place at the right time while on a business trip in San Francisco. The first stop was to meet the legendary Tony Parisi, Head of AR/VR at Unity (maker of the Unity game engine). Tony is well known as the co-creator of VRML (Virtual Reality Modeling Language), an immersive web technology from the mid-1990s, during the first wave of the internet.
Tony and I had a quick meeting to discuss the Khronos Group’s OpenXR specification for ubiquitous, cross-platform augmented and virtual reality. We then discussed the Khronos Group’s glTF, a project Tony worked on deeply to fix a big problem: until recently there was no “.JPG” equivalent for portable 3D assets. Tony, the Khronos Group, and its members created glTF as the new standard for portable 3D graphics, such as the 3D scanned assets that the Microsoft Capture Studio creates. Our mutual goal is to help make the web a 3D immersive medium, where 3D is the interface; to make this happen, a lot of technology and creativity must come together.
Update: Facebook is leading the way in augmented and virtual reality (for now) as of its recent F8 conference; learn more about it via Tony Parisi’s post.
After the meeting, Tony picked up the phone for a quick call with the team at the Microsoft Reactor in San Francisco. They happened to be launching their new Mixed Reality Capture Studio that day, and I was invited down to visit within the hour. Thanks Tony!
Entrance to Microsoft Reactor S.F.
Microsoft Reactors are developer community hubs, offering some of the best technical events, community networking, and collaboration. Microsoft just opened a new Reactor in Seattle’s South Lake Union (SLU) district, and the Reactors are becoming known as good venues, with solid event and meeting infrastructure, for hosting community tech events.
What is Mixed Reality Capture?
The studio produces video holograms, which look like video from any given viewpoint but exist volumetrically in 3D space. Viewers can change their view of a performance at any time, or actually move around the video, in mixed reality experiences.
- 3D capture of people, animals, and objects – all shot in a large room with 106 cameras and huge storage devices
- The typical mixed reality capture stage has RGB and IR cameras, plus unstructured static IR laser light sources, surrounding the capture volume
- The system captures color and depth data, then runs custom software to combine the different data to reconstruct the volumetric video file
- Content from this system typically would be used with other assets such as scenes, props, SFX and audio in 3D game engine software such as Unity 3D
- Today, and rapidly moving into the future, this technology will enable many types of experiences and business models, from fashion e-commerce to sports and celebrity interactions
- Viewers for web, mobile, and mixed reality (VR) headsets are available to interact with and experience new volumetric content
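The reconstruction step in the list above – combining per-camera color and depth data into one volumetric model – starts by back-projecting each depth pixel into a 3D point. Here is a minimal sketch of that first step, assuming a simple pinhole camera model; the intrinsics and the toy depth map are illustrative, not the studio’s actual calibration:

```python
# Sketch: back-projecting one camera's depth map into 3D points, the first
# step of fusing per-camera depth data into a volumetric model.
# fx, fy, cx, cy are assumed pinhole intrinsics (focal lengths and
# principal point, in pixels); depth values are in meters.

def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map to a list of (X, Y, Z) camera-space points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:          # no depth measured at this pixel
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Toy 2x2 depth map, 1 m everywhere, principal point at pixel (1, 1).
depth = [[1.0, 1.0],
         [1.0, 1.0]]
pts = backproject(depth, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
print(pts)  # pixel (1, 1) maps to (0.0, 0.0, 1.0)
```

In the real system this runs per camera, and the resulting point clouds from all viewpoints are registered and meshed by Microsoft’s custom software.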
360° Scanning – Looks like video, but is full 3D and viewable from any angle at any point on the timeline
This technology is obviously cool and early in the development curve. A natural fit for experimentation is of course with music videos. The image below features Billy Corgan, best known as the lead singer, primary songwriter, guitarist, and sole permanent member of The Smashing Pumpkins.
Billy Corgan’s new production “Aeronaut” is one of the first to feature a hologram created with Microsoft Mixed Reality Capture technology, used to capture Corgan’s performance in volumetric video while incorporating innovative technologies from Google and Unity Technologies. This 2D (web, TV, mobile, etc.) content was created inside a 3D world imagined by San Francisco artist and filmmaker Danny Bittman and brought to life by the joint Viacom NEXT and Isobar team. By capturing Corgan’s three-and-a-half-minute performance in volumetric video at Microsoft’s Mixed Reality Capture Studios, Isobar, Viacom, and Bittman were able to use the Unity creation engine and Tilt Brush to create a world around him.
We are quickly coming to an inflection point where the entire volumetric media production chain and end-user devices will hit a sweet spot. Then creatives and entrepreneurs will experiment with all types of new experiences, mixing up reality in ways never imagined. Big ideas were covered at the recent Immersed Conference hosted in Portland.
Realistically, I see we are a few years out from this exotic media asset production system being practical and affordable. The data rates, for example, are massive. Tools to polish the 3D hologram content are still new and will need future AI assistance to save time cleaning up models and imaging shoots.
Check out my post covering the NVIDIA Holodeck Demo to learn about how this technology could tie into live virtual reality collaboration in the future.
Data Rates Are Massive: 10 Gigs Per Second of Raw Data
This version of the system ran 106 synchronized cameras: 53 infrared (IR) cameras used to capture depth, and 53 red, green, blue (RGB) cameras used to capture color and texture.
The sensors shoot at 30–60 FPS (frames per second). In my opinion, we need 120 FPS to match how our visual system and brain work. Yes, we are early, but with massive data clouds and AI image manipulation, holograms will be a reality in the next five years.
The system had a total of four minutes of continuous recording time. This is fine at this stage of the technology curve, because attention spans are short and multiple shoots can be combined to produce content.
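Those two figures – the raw data rate and the recording limit – make for a quick back-of-the-envelope calculation of storage per take. A minimal sketch, assuming the “10 gigs per second” in the heading above means 10 GB/s (gigabytes, not gigabits, is my assumption):

```python
# Back-of-the-envelope storage for one full capture take.
# Assumes "10 gigs per second" means 10 GB/s of raw data across all cameras.

RAW_RATE_GB_PER_S = 10          # assumed combined raw output, GB per second
TAKE_SECONDS = 4 * 60           # four minutes of continuous recording

raw_take_gb = RAW_RATE_GB_PER_S * TAKE_SECONDS
print(f"One full take: {raw_take_gb} GB (~{raw_take_gb / 1000:.1f} TB)")
# → One full take: 2400 GB (~2.4 TB)
```

Even a single maximum-length take lands in terabyte territory, which is why the stage needs the “huge storage devices” mentioned earlier.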
Output is typically in the range of 40K triangles with a 2K texture per character for a VR device, down to 10K triangles with a 1K texture for mobile devices.
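Those per-device budgets boil down to a simple lookup when deciding how to export a character. A sketch using the numbers above – the device names and the lookup-table structure are mine, not part of the studio’s actual tooling:

```python
# Per-device asset budgets from the studio's typical output ranges.
# Values are (max triangles, texture edge length in pixels).
# The table and device keys are illustrative assumptions.
ASSET_BUDGETS = {
    "vr":     (40_000, 2048),   # 40K triangles, 2K texture
    "mobile": (10_000, 1024),   # 10K triangles, 1K texture
}

def budget_for(device):
    """Return the (triangles, texture_px) budget for a target device."""
    return ASSET_BUDGETS[device]

print(budget_for("mobile"))  # → (10000, 1024)
```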
Projects are delivered as raw .obj/.png files, or in a compressed format delivered as a streamable .mp4 file.
The system can compress down to rates typical for HD video, using H.264 for the .mp4 files. Demo captures released with HoloLens ran between 7 Mbps and 12 Mbps; higher-resolution and/or uncompressed formats would be on the order of several hundred MB for a 30-second clip.
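Those bitrates translate directly into download sizes. A quick sketch, using the 7–12 Mbps range quoted above and converting megabits to megabytes by dividing by 8:

```python
# Estimated file size for a streamed volumetric clip at a given bitrate.

def clip_size_mb(bitrate_mbps, seconds):
    """Megabits/s * seconds / 8 bits-per-byte = megabytes."""
    return bitrate_mbps * seconds / 8

# A 30-second clip at the HoloLens demo bitrates:
low = clip_size_mb(7, 30)    # → 26.25 MB
high = clip_size_mb(12, 30)  # → 45.0 MB
print(f"30 s clip: {low:.2f}-{high:.2f} MB")
```

So a compressed 30-second clip comes in around 26–45 MB, versus the several hundred MB cited for uncompressed formats.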
Thank you Microsoft & Team!
New metrics and analytics for how users interact with volumetric objects and spaces are needed to refine experience production, optimize how goals are met, and find the best ways to connect these new forms of interactive 3D communication with customers. The new all-in-one VR headsets, the mysterious holographic modular RED Hydrogen phone, and RED’s announced partnership with Facebook on a new virtual reality camera show promise for the near future.
Joining me on the tour was Bruce Bartlof, CTO and Principal at Gensler, widely recognized as the world’s leading collaborative design firm, not just the largest. It was great to catch up with Bruce, gain insights into the generative design methods he used for the NVIDIA HQ, and discuss the latest in AI design and XR for new visual architecture designs.
Hannah Bianchini and the team at Microsoft were superb hosts and followed up with the latest details on the new studios. The current Microsoft Mixed Reality Capture Studios are located in San Francisco, Seattle, and London. Connect with me if you have budget and want help to conceive, build, and deliver a project using this capture technology to create engaging immersive experiences.