Efficient Path Tracer for the Presence of Mobile Virtual Reality
• SeongKi Kim1 and ManKyu Sung2,*

Human-centric Computing and Information Sciences volume 11, Article number: 16 (2021)
https://doi.org/10.22967/HCIS.2021.11.016

Abstract

Global illumination can achieve photorealistic rendering for mobile virtual reality (VR), naturally expressing shadows, caustics, and color bleeding, and ultimately enhancing users' sense of immersion and presence. However, it has high computational complexity because it must simulate all rays to every position in two virtual planes. In addition, the majority of mobile devices have low performance compared to desktops, which makes it difficult to achieve real-time performance in global illumination. This study proposes novel algorithms to realize global illumination for mobile-based VR. The proposed algorithms exploit the facts that many objects are rendered twice from different camera positions and that most recent mobile devices have both a CPU with heterogeneous cores and a GPU. The proposed algorithms help to increase CPU utilization from approximately 13% to 80%, rendering performance by 14.69%, and similarity with the ground-truth image by 39.06%, while decreasing noise by 7.05% at the initial low-sampled status.

Keywords

Virtual Reality, Rendering, Ray Tracing, Mobile Applications

Introduction

Over the last decade, virtual reality (VR) has been actively researched [1-3], and many applications, such as video games and education/therapy content [4], have been developed based on VR. However, desktop-based VR applications have the disadvantage of requiring a high-performance desktop. Even with such a device, VR applications need a head-mounted display (HMD) and long wires connected to the desktop; this inevitably restricts both the user's movements and the device's portability.
Meanwhile, mobile devices have become commonplace throughout the world, and are now widely used as a research platform [5, 6]. In addition to these widespread mobile devices, the performance of hardware has evolved dramatically. For example, the performance of the mobile CPU has increased 14.12 times (Nexus 4: 1,568, Pixel 4 XL: 22,135, CPU Mark) and that of the mobile GPU 14.06 times (Nexus 4: 2,647, Pixel 4 XL: 54,280, 3D Graphics Mark) between 2012 and 2019 [7]. Additionally, the resolution of the mobile display has increased 4.45 times (Nexus 4: 768×1280, Pixel 4 XL: 1440×3040), and that of the mobile camera 2 times (Nexus 4: 8 Mpixels, Pixel 4 XL: 16 Mpixels) during the same period.
With these performance improvements, immersive mobile-based HMDs, such as Cardboard, Daydream, and Gear VR, have become available at affordable prices. Compared with desktop-based VR, mobile-based VR offers the advantages of portability and freedom from wires, meaning that it can be used anywhere if there is sufficient physical space. Therefore, more people can use mobile-based VR.
Nevertheless, users must attach a mobile device close to their eyes to use VR content, such as games, simulations, or educational/medical contents; therefore, the quality of actual rendering is critical. If the user sees poorly rendered results, their sense of immersion and presence will decrease considerably, and their satisfaction with the VR content will decrease substantially.
Global illumination is a rendering method that tracks the indirect light reflected by other virtual objects in a 3D scene as well as the direct light emitted by light sources, so it satisfies the demand for high-quality rendering; this in turn can greatly enhance the users' sense of immersion and presence. To achieve global illumination, unbiased Monte Carlo path tracing [8] can be used. However, path tracing is a time-consuming algorithm because all rays emanating from light sources and all reflected/refracted rays must be traced in order to render a single scene. In particular, when path tracing runs on mobile devices, whose performance and memory are limited compared to a desktop, it takes a considerable amount of time to obtain even a single image of reasonable quality. Therefore, this study proposes novel algorithms for a real-time path tracer with less noise for mobile VR, and verifies them with experiments on real mobile devices.
The contributions of this paper can be summarized as follows. First, this paper proposes algorithms that can increase the performance of path tracers for mobile-based VR with less noise. The suggested algorithms helped to increase the rendering performance by 14.69% and make the rendering quality more similar to the ground-truth image by 39.06% at the initial sampling, while decreasing noise by 7.05%. Second, this paper proposes a novel stereoscopic rendering algorithm for VR that considers reflective and specular cases, which previous stereoscopic rendering techniques did not consider. Third, this paper proposes algorithms that can fully utilize a mobile CPU with heterogeneous cores and a GPU simultaneously. Using the proposed algorithms, CPU utilization was increased from approximately 13% to 80%. Fourth, this paper proposes a method of probabilistic ray stopping and filtering based on the user's gaze. Lastly, this study is the first to apply a path tracer to mobile virtual reality, measure its rendering performance, and illustrate its quality on real mobile devices.
The remainder of this paper is organized as follows: Section 2 reviews related works to facilitate the reader's understanding of this study; Section 3 describes the suggested algorithms and the rationale behind them; Section 4 describes the implementation of the proposed algorithms, their performance, and the rendering results; and Section 5 presents the conclusion.

Related Works
This section describes prior works and the background to this paper.

Path Tracing
Global illumination can render a 3D scene with photorealistic quality. It tracks the light arriving directly from light sources (direct light) as well as the light diffused, reflected, and refracted by other 3D objects (indirect light) [8]. Path tracing is an algorithm that realizes global illumination with Monte Carlo integration, performed by casting random rays whenever a ray intersects a diffuse object; it is the most popular choice today due to its robustness and efficiency.
Among the processes of path tracing, the intersection tests generate significant overhead because a large number of intersections between rays and polygons must be tested repeatedly. To reduce this overhead, hardware-based approaches [9, 10] or software-based approaches such as the bounding volume hierarchy (BVH) [11] and the K-dimensional (KD) tree [12] are widely used. General-purpose computing on graphics processing units (GPGPU) has also been used to trace a huge number of rays simultaneously. Through such efforts, path tracing can now be executed at high speed [13-16] on a desktop. However, all of these methods have targeted desktops rather than mobile devices. As such, this paper describes a path tracer that targets mobile devices and attempts to fully utilize a mobile CPU with heterogeneous cores and a mobile GPU.
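The cost the paragraph describes comes from the nearest-hit search each ray performs. The following is a minimal illustrative sketch (not the paper's implementation): a brute-force linear scan over spheres, which is exactly the loop that a BVH or KD-tree replaces with a hierarchical traversal.

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the nearest positive ray parameter t for a ray/sphere hit, or None.

    Solves ||origin + t*direction - center||^2 = radius^2, assuming |direction| = 1.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere
    sqrt_disc = math.sqrt(disc)
    for t in (-b - sqrt_disc, -b + sqrt_disc):
        if t > 1e-6:  # ignore hits behind (or exactly at) the origin
            return t
    return None

def nearest_hit(origin, direction, spheres):
    """Brute-force nearest-intersection test; this linear scan is what
    acceleration structures such as a BVH or KD-tree speed up."""
    best = None
    for center, radius in spheres:
        t = intersect_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best):
            best = t
    return best
```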

Heterogeneous CPU Cores
Many techniques have been developed to save energy on mobile devices, including the big.LITTLE architecture [17, 18]. In this hardware architecture, the CPU has heterogeneous cores whose levels of performance vary. Low-performance (low-energy) cores run the operating system and applications with a small amount of energy if the applications require only low performance, while high-performance (high-energy) cores run them in cases where they require high performance. This architecture differs from the multi-core designs widely used in desktops in that each core has a different performance level. The big.LITTLE architecture has been widely used in most mobile devices since its introduction in 2013.
Much research [19, 20] has been conducted to increase performance and reduce energy consumption. In [19], the big.LITTLE architecture was used to improve the performance of applications that use convolutional neural networks (CNNs). Wang et al. [19] reported that the performance of CNN applications increased by approximately 20% by improving the throughput of the heterogeneous cores by 39%. The authors of [20] used function multi-versioning (FMV) and compiler-based optimization for the big.LITTLE architecture, and verified that the performance of TensorFlow Lite [21] increased by 11.2% and 17.9% on Cortex-A55 and Cortex-A75 mobile cores, respectively.
The algorithms proposed in this paper try to utilize this architecture for global illumination to the maximum extent. Furthermore, they can run on any recent system with multiple or heterogeneous cores, not only on the mobile-specific big.LITTLE architecture.

Foveated Rendering
In the human visual system, the eye has two different types of photoreceptors: cones and rods [22]. Cones and rods sense color and brightness, respectively. The cones are concentrated in the central area of the retina, i.e., the fovea (approximately 5.2° around the central optical axis). The density of cones decreases exponentially past the parafovea (approximately 5.2° to 9°) and perifovea (approximately 9° to 17°). The density of rods is highest at the perifovea and drops linearly in the remaining area. This means that a human senses color most sharply at the fovea and light most sharply at the perifovea; outside these areas, the abilities to sense color and light decrease exponentially and linearly, respectively. Consequently, the highest rendering quality is not needed everywhere all the time; instead, a scene can be rendered at lower quality while minimizing human awareness of it. This adaptive rendering based on the user's gaze is called foveated rendering [23].
Because controlling the rendering quality based on the gaze position can increase rendering performance significantly, many studies [24, 25] have applied it to ray/path tracing. The authors of [24] theoretically showed that at least 70% of rays could be omitted in a path tracer on a recent VR device without humans being aware of it. In [25], fewer primary rays were traced in the peripheral regions of vision, and the frame rate doubled without any noticeable degradation of image quality for the users.
For rendering performance, this paper uses a probabilistic ray-stopping scheme based on the Euclidean distance from the gaze, because it is practical for a mobile-based path tracer. The non-uniform and sparse sampling techniques based on the tracked gaze position in [25] can strengthen aliasing and flickering effects, because the central-vision area receives more samples and the peripheral-vision area fewer. To reduce the high variance caused by stopping rays early, the proposed algorithms remove noise whenever a ray stops early.

Algorithm Overview
This section describes the basic ideas and rationale behind the proposed algorithms.

Heterogeneous Cores within Mobile Devices
To increase the rendering performance through heterogeneous CPU cores within mobile devices, the sequences were designed as illustrated in Fig. 1. As shown in Fig. 1, the algorithms try to run the path tracing on all of the available CPU cores separately, while the GPU runs independently.
Fig. 1. Overall sequences of each hardware component.

In Fig. 1, a red arrow shows the flow, and a box is a hardware component or a process that the hardware performs. CPU core 1 initiates the GPU and periodically merges the results from the CPU and the GPU. All cores other than CPU core 1 trace paths. The difference in the path tracing between the CPU and the GPU is that each CPU core independently processes all pixels in the plane, whereas each core in the GPU processes a single pixel, because all of the mobile GPU cores have the same hardware architecture, unlike the mobile CPU, which uses the big.LITTLE architecture. If each CPU core independently processed the path tracing of a single pixel, the fast CPU cores would have to wait for the slow CPU cores before merging the results into a single buffer. To avoid this synchronization overhead, different amounts of work were given to the CPU and the GPU.
More specifically, the roles of core 1 are to make the path tracing run on the GPU periodically, check the execution on the GPU and CPU, merge all colors from the CPU and GPU when the tracing finishes, and then send the merged results to the GPU for the next sampling. The roles of the other CPU cores are to run the path tracing on all pixels in the plane repeatedly and to notify core 1 of the results.
Core 1 is designated to merge all of the results from the CPU and GPU because of the synchronization problem: if cores 2 and 3 tried to merge their results into a single buffer at the same time, the two cores would have to communicate to avoid a write conflict. To avoid this, only core 1 merges the results, whereas the other cores only fill their own buffers and notify core 1. With this workload subdivision, the suggested algorithms try to maximize the utilization of the CPU as well as the GPU without any additional overhead.
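The single-merger design can be sketched with ordinary threads: each worker fills only its private buffer and signals completion, and one designated thread is the sole writer of the shared result, so no per-pixel locking is needed. This is a simplified illustrative sketch, not the paper's CPU/GPU code; all names are invented for illustration.

```python
import threading

NUM_PIXELS = 8  # a tiny stand-in for a full image plane

def run_workers_and_merge(num_workers=3):
    """Mirror the core-1 role: workers trace into private buffers and signal;
    only the merger thread ever writes the shared `merged` buffer."""
    buffers = [[0.0] * NUM_PIXELS for _ in range(num_workers)]
    done = [threading.Event() for _ in range(num_workers)]

    def worker(i):
        for p in range(NUM_PIXELS):
            buffers[i][p] = 1.0  # stand-in for one traced sample per pixel
        done[i].set()            # notify the merger without touching shared state

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()

    merged = [0.0] * NUM_PIXELS
    for i in range(num_workers):  # the merger is the only writer of `merged`
        done[i].wait()
        for p in range(NUM_PIXELS):
            merged[p] += buffers[i][p]

    for t in threads:
        t.join()
    return merged
```

Because each worker owns its buffer and the merger waits on an event per worker, the fast workers never block on the slow ones during tracing; only the final accumulation is serialized.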

Stereo Rendering with Spatial Reprojection
While shading the left and right planes, the proposed algorithms reuse the previously shaded colors from the right and left planes. In other words, the results for the left and right planes are spatially re-projected onto the next right and left planes. These re-projections add more samples, and the additionally sampled results accumulate in the next plane. As a result, the algorithms can produce more converged results than the original path tracing even with the same sampling count.
The re-projection from the left plane to the right plane is shown in Fig. 2.
Fig. 2. Reprojection from the left plane to the right plane.

In Fig. 2, the proposed algorithm stores the intersected point $P_c$ and its resulting color, and uses them when rendering the pixel $P_r$ in the right plane. To carry the shaded color from the left plane to the right plane, the algorithm must identify the point $P_r$ in Fig. 2. Considering Fig. 2 in three dimensions, and assuming that $\mathbf{n}$ is the normal vector of the right plane and $P_0$ is a known point on it, Equation (1) is satisfied.

$\mathbf{n} \cdot (P_r - P_0) = 0,$  (1)

where $P_0$ is a known point on the right plane.

Furthermore, Pr can be defined in parametric form with a vector, as shown in Equation (2) below.

$P_r = P_c + t\,\mathbf{v},$  (2)

where $\mathbf{v} = P_{re} - P_c$ is the direction from the first hit point $P_c$ toward the right eye position $P_{re}$.

By combining Equations (1) and (2), Equation (3) can be obtained.

$\mathbf{n} \cdot (P_c + t\,\mathbf{v} - P_0) = 0$  (3)

From Equation (3), t can be found by using Equation (4).

$t = \frac{\mathbf{n} \cdot (P_0 - P_c)}{\mathbf{n} \cdot \mathbf{v}}$  (4)

Then, from Equations (2) and (4), Equation (5) can be used to find Pr.

$P_r = P_c + \frac{\mathbf{n} \cdot (P_0 - P_c)}{\mathbf{n} \cdot \mathbf{v}}\,\mathbf{v}$  (5)

Although the point $P_r$ can be found by using Equation (5), it may lie outside the plane, or other 3D objects may block the ray to the right plane; therefore, the point $P_c$ may be invisible from the right eye's position, $P_{re}$. To test the visibility, the algorithm checks whether $P_r$ is within the plane and whether any obstacle blocks the ray before it reaches the right plane.
In Fig. 2, the color of $P_l$ can be reprojected to that of $P_r$ if the material property of $P_c$ is diffuse. When the material has only a diffuse property, it reflects rays in every direction, but if the material has a reflective property, it reflects in a single fixed direction. In this case, the algorithm does not create a ray from $P_c$ to $P_{re}$; rather, it creates a ray from $P_c$ in the direction of the reflection vector $\mathbf{v}_{refl}$. The point $P_{refl}$ can then be calculated through Equation (6), which is derived by the same procedure as that used for Equation (5).

$P_{refl} = P_c + \frac{\mathbf{n} \cdot (P_0 - P_c)}{\mathbf{n} \cdot \mathbf{v}_{refl}}\,\mathbf{v}_{refl}$  (6)

Then, the algorithm checks whether $P_{refl}$ lies outside the plane or whether other 3D objects block the ray. Although the idea of reprojection was suggested in [26], that paper did not handle the reflection case.
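The reprojection step is a standard ray-plane intersection: a ray from the first hit point toward the other eye (or along the reflection vector) is intersected with the other image plane. A minimal sketch follows; the function and variable names are illustrative, not from the paper's code.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def reproject(p_c, eye, n, p0):
    """Intersect the ray from hit point p_c toward `eye` with the plane that
    has normal n and passes through p0; return the intersection point, or
    None when the ray is parallel to the plane (no reprojection possible)."""
    v = [e - p for e, p in zip(eye, p_c)]  # ray direction toward the eye
    denom = dot(n, v)
    if abs(denom) < 1e-9:
        return None  # ray parallel to the plane
    t = dot(n, [a - b for a, b in zip(p0, p_c)]) / denom
    return [p + t * d for p, d in zip(p_c, v)]
```

For the reflective case, the same routine applies with the reflection direction substituted for the eye direction; the caller must still verify that the result lies inside the plane bounds and is not occluded, as the text describes.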

Probabilistic Ray Stopping
To exploit the characteristic of the human visual system whereby the ability to perceive color and brightness decreases with the distance from the gaze position [22], the probability of stopping a ray's bounces is increased according to the distance from the gaze position. In this study's approach, the stopping probability of ray bounces increases linearly, based on Equation (7), as a pixel moves further away from the fovea.

$P_d = P_{max} \cdot \frac{d}{d_{max}}$  (7)

where $d_{max}$ and $d$ represent the maximum distance from the gaze point within a plane and the actual distance from the gaze point to the currently rendered position, respectively. $P_d$ is the probability that a ray stops when it bounces; it increases linearly as the traced pixel moves further from the gaze point. A maximum probability, $P_{max}$, is used to avoid the problem of all rays being stopped, especially at the border area. This probability-based stopping is applied only after a depth threshold, to avoid cases in which an unexpected color would be rendered because a ray was stopped at the initial stage of tracing.
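The stopping rule above can be sketched as follows. The linear probability follows Equation (7), gated by a depth threshold so early bounces are never terminated; the default values (0.9 and a threshold of 1) follow the experimental setup in Section 4, and the names are illustrative.

```python
import random

def stop_probability(d, d_max, p_max):
    """Linear stopping probability per Equation (7): grows with the distance d
    from the gaze point and is capped at p_max so border rays can survive."""
    return p_max * min(d / d_max, 1.0)

def should_stop(depth, d, d_max, p_max=0.9, depth_threshold=1, rng=random.random):
    """Only rays past the depth threshold are candidates for early termination;
    `rng` is injectable so the decision can be made deterministic in tests."""
    if depth <= depth_threshold:
        return False
    return rng() < stop_probability(d, d_max, p_max)
```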

Noise Removal for the Stopped Ray
Path tracing inherently generates noise due to variance until a pixel color converges to a fixed value with reasonable quality; convergence is computationally expensive, and users can perceive the noise beforehand, which can decrease the immersive feeling so critical for VR. In addition, more rays stop bouncing when a pixel is far from the gaze point, based on Equation (7), which can produce a more divergent result for the same period, especially outside the fovea area. To mitigate this problem, an additional noise-removal algorithm can be applied when a ray stops early. However, when a traced ray intersects a triangle or a sphere with a reflective or refractive property, noise should not be removed, because important caustic effects would also be removed by the filter. To avoid this, the filter is applied only if the ray intersects exclusively diffuse materials while being traced.
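A sketch of this gated filtering: the filter runs only on pixels whose ray stopped early and whose path touched only diffuse materials, so caustics produced by reflective or refractive hits survive. The 8-neighbor median follows the filter described in the Results section; the function names and buffer layout are illustrative assumptions.

```python
def median_of_neighbors(img, x, y):
    """Median of the (up to) 8 neighboring pixel values around (x, y)."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))
            if (i, j) != (x, y)]
    vals.sort()
    m = len(vals)
    return vals[m // 2] if m % 2 else 0.5 * (vals[m // 2 - 1] + vals[m // 2])

def denoise(img, stopped_early, diffuse_only):
    """Apply the median filter only where the ray terminated early AND hit
    only diffuse materials; all other pixels pass through unchanged."""
    return [[median_of_neighbors(img, x, y)
             if stopped_early[y][x] and diffuse_only[y][x] else img[y][x]
             for x in range(len(img[0]))]
            for y in range(len(img))]
```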

Algorithm Details and Results

Based on the overview of the algorithms presented in Section 3, Algorithms 1-5 are designed for the mobile CPU and the GPU. The algorithms use color, pixel, and reprojection buffers. The color buffer, $B_c$, is used by each CPU core and the GPU to store the shaded results temporarily. The pixel buffer, $B_p$, is used to store the final result, and all the CPU cores and the GPU share it. The reprojection buffer, $B_r$, is used to send the calculated colors to the other plane, and each CPU core, as well as the GPU, has its own reprojection buffer.

Algorithms in Detail
Algorithm 1 merges the reprojection buffers that are produced by the previous execution at the CPU and the GPU.
At line 1, Algorithm 1 waits for a signal from Algorithm 4. Algorithm 1 reads the re-projection buffer, $B_r^g$, from the GPU and merges it into a single color buffer, $B_c^m$, at lines 2 and 3. The algorithm then checks whether each CPU core, $c_i$, has finished filling its reprojection buffer, $B_r^{c_i}$, through Algorithm 2, and merges $B_r^{c_i}$ into $B_c^m$ at lines 4 to 8.

Algorithm 1. CPU-side algorithm that merges the re-projection buffers
Input: Reprojection buffers, $B_r$, produced by the CPU/GPU
Output: A single color buffer, ($B_c^m$), that integrates all of the re-projection buffers

1: Wait for the event to be signaled by Algorithm 4
2: Read $B_r^g$ produced by Algorithm 4
3: Merge $B_r^g$ into $B_c^m$
4: for $c_i$ ∈ all cores in the CPU do
5:    if $c_i$ finishes filling $B_r^{c_i}$ then
6:       Merge $B_r^{c_i}$ into $B_c^m$
7:    end if
8: end for
The CPU-side path tracing is described in Algorithm 2, which is executed by the other high/low-performance cores except the main core, which runs Algorithms 1 and 4. Algorithm 2 calls Algorithm 3 for every pixel in the plane.

Algorithm 2. CPU-side algorithm that performs the path tracing
Input: 3D objects, camera/gaze position, depth threshold, $T_d$
Output: Color buffer, $B_c^{c_i}$, for the current plane and re-projection buffer, $B_r^{c_i}$, for the next plane

1: for $P_l$ ∈ all pixels in the current plane do
2:    Call Algorithm 3 with $B_c^{c_i}$ and $B_r^{c_i}$
3: end for

Algorithm 3 is the modified path tracing and fills both a color buffer for the current plane and a re-projection buffer to be delivered to another plane.

Algorithm 3. Modified path tracing that is commonly used by Algorithms 2 and 5
Input: 3D objects, camera/gaze positions, pixel position, $P_l$, depth threshold, $T_d$
Output: Color buffer, $B_c$, for the current plane and re-projection buffer, $B_r$, for the next plane

1: cur_ray ← generate a ray from the camera to $P_l$
2: depth ← 0
3: while the stop condition is not satisfied do
4:    Find the nearest object hit, $O_{dmin}$, with cur_ray
5:    if $O_{dmin}$ does not exist then
6:       Fill $B_c$[$P_l$] with black, and break
7:    end if
8:    Shade the color based on the property of $O_{dmin}$ into $B_c$[$P_l$]
9:    if depth = 0 then
10:      Store the first hit point, $P_c$
11:      if $O_{dmin}$ has a diffuse property then
12:         Generate a ray, next_ray, from $P_c$ to the camera of the next plane
13:         Find the hit point, $P_r$, with Equation (5) between next_ray and the next plane
14:         if $P_r$ exists and next_ray is not blocked by other 3D objects then
15:            Store $P_r$
16:         end if
17:      end if
18:      if $O_{dmin}$ has the reflective property then
19:         Generate a ray, next_ray, from $P_c$ in the reflected direction
20:         Find the hit point, $P_{refl}$, with Equation (6) between next_ray and the next plane
21:         if $P_{refl}$ exists and next_ray is not blocked by other 3D objects then
22:            Store $P_{refl}$
23:         end if
24:      end if
25:   end if
26:   if $O_{dmin}$ has the diffuse property then
27:      cur_ray ← generate a new ray in a random direction
28:   else if $O_{dmin}$ has reflection/refraction properties then
29:      cur_ray ← generate a new ray in the reflection/refraction direction
30:   end if
31:   if depth > $T_d$ then
32:      Calculate the stopping probability with Equation (7)
33:      Stop the ray with probability $P_d$
34:   end if
35:   depth ← depth + 1
36: end while
37: Fill $B_c$[$P_l$] with the shaded color
38: if either $P_r$ or $P_{refl}$ is stored then
39:    Fill $B_r$[$P_r$ or $P_{refl}$] with the shaded color
40: end if

Algorithm 3 tries to find the nearest intersected object after generating a ray. If one is found, the algorithm generates a ray to the second camera, finds the point of intersection with the second plane (lines 9-25), and stores the point for later use. The path tracing can stop when the stopping condition is satisfied at line 3, or with the probability given by Equation (7) at line 33. When it stops, Algorithm 3 fills the shaded color into the color buffer for the current plane (line 37) and stores the color for the other plane (line 39).
Algorithm 4 is executed by the CPU when a plane needs to be rendered.

Algorithm 4. CPU-side algorithm that starts the GPU
Input: Color buffer, $B_c^m$, merged by Algorithm 1 and color buffer, $B_c$, by the CPU cores
Output: Pixel buffer, $B_p$, of the current plane and color/re-projection buffer, $B_c$/$B_r$, for the next plane

1: Check that Algorithm 1 finishes merging
2: Deliver $B_c^m$ to Algorithm 5
3: Deliver $B_c$ to Algorithm 5
4: Call Algorithm 5 on the GPU
5: Read $B_p$ filled by Algorithm 5 on the GPU
6: Signal the event that makes Algorithm 1 run

The role of Algorithm 4 is to deliver the merged color buffer and the calculated color buffers created by the CPU cores to Algorithm 5 on the GPU and to fill the pixel buffer for the current plane and the re-projection buffers for the next plane.
Algorithm 5 is executed by the GPU; it merges the color buffers from the CPU cores, fills values into the pixel buffer of the current plane, and delivers the re-projection buffer to Algorithm 4. Algorithm 5 also removes the noise.

Algorithm 5. GPU-side algorithm that performs the path tracing
Input: 3D objects, camera/gaze position, depth threshold, $T_d$, merged color buffer, $B_c^m$, from Algorithm 1, color buffers by the CPU cores, $B_c$, color buffer for the GPU, $B_c^g$, and re-projection buffer, $B_r^g$
Output: Pixel buffer, $B_p$, of the current plane and re-projection buffer, $B_r$, of the next plane

1: Find the pixel position, $P_l$
2: $B_c^g$[$P_l$] ← $B_c^m$[$P_l$]
3: Call Algorithm 3 with $B_c^g$ and $B_r^g$
4: for $c_i$ ∈ all cores in the CPU do
5:    if $c_i$ finishes filling $B_c^{c_i}$ then
6:       Merge $B_c^{c_i}$[$P_l$] into $B_c^g$[$P_l$]
7:    end if
8: end for
9: Calculate $B_p$[$P_l$] from $B_c^g$[$P_l$]
10: if the primary ray does not intersect a reflective or refractive material then
11:   if the ray is stopped early then
12:      Apply the noise removal algorithm
13:   end if
14: end if

Results
All the algorithms were implemented with Android Studio 3.4.2 and Google's VR SDK 1.160.0, and ran on a Galaxy S10+ with an Exynos 9820 (2 custom cores + 2 Cortex-A75 + 2 Cortex-A55, Mali-G76 MP12 at 600 MHz, and 8 GB of memory). The device uses Android 9.0 (Pie) as the platform and supports OpenCL [27] for GPGPU acceleration. OpenCL was used to run Algorithm 5 on the GPU, and a KD-tree was used to accelerate the intersection tests. As the noise removal algorithm, this study used the median filter, which chooses the median value of the 8 neighboring pixels, because it is widely used and showed a good effect in the test cases.
The rendered scenes included the Cornell box scene with five planes, one cube, one table, two triangles as lights, and two spheres with reflective and refractive properties as the default. The ant, teapot, bunny, monkey, and knot models were also included. The numbers of points, spheres, and triangles are listed in Table 1.

Table 1. Detailed information of the rendered models
Model Points Spheres/Triangles
Teapot on a table 966 2/322
Ant on a table 2982 2/994
Bunny on a table 3090 2/1030
Monkey on a table 3153 2/1051
Knot on a table 8046 2/2682

Because an eye-tracking device is not yet available for mobile devices, and the user cannot touch a mobile device within the mobile HMD, it was assumed that the user always concentrates on the center of the plane. However, the algorithm allows the gaze position to be changed at any time.
For the implementation of the algorithms, SmallPTGPU [28] was used. It is an open-source project that implements a path tracer on the desktop with OpenCL, and it was significantly modified for mobile VR. To aid understanding of the performance data and the figures in this section, a result video was created; please refer to the accompanying video (https://youtu.be/ekytN5CvGAE).

CPU utilization
To check whether the suggested algorithms fully use the heterogeneous cores within the mobile CPU, the CPU usage was profiled with Android Studio while running the algorithms described in Section 4.1. Fig. 3 shows the profiled results without and with the proposed algorithms.
As shown in Fig. 3, the mobile CPU is utilized more when the proposed algorithms are used. In Section 3.1, core 1 is designated to merge the previous results and initiate the GPU, while the other CPU cores trace additional rays. By applying these algorithms, CPU utilization increased from approximately 13% to 80% on average.

Fig. 3. Profiled CPU utilization while running path tracing without/with the proposed algorithms that use heterogeneous cores at the same time: (a) is profiled without the proposed algorithms for the heterogeneous cores and (b) is profiled with them.

Rendering quality
The original path tracing and the suggested algorithms were executed while increasing the x position by 5 pixels and rendering to a 640×480-pixel plane, as shown in Fig. 4. Heuristically, 1.0 was set as the depth threshold for Algorithm 3, and 0.9 was used as the maximum probability, $P_{max}$, for the stop condition in Equation (7). These results were obtained after sampling 100 times. The image on the left shows the VR display obtained by unmodified path tracing, while that on the right shows the result with the proposed algorithms. Fig. 4 shows that the algorithms improve the rendering quality thanks to the extra samples from the other plane and the additional samples from the CPU cores; as a result, they provide a more converged result even at the same sampling rate.

Fig. 4. Rendering comparisons after 100 samples: the results obtained by the original path tracing (a) and the proposed algorithms (b). Rendering with the proposed algorithms produced more converged results and less noise with the same sampling count.
The reduced noise is illustrated in Fig. 5, which is a magnification of Fig. 4. The images on the left are those produced by the original path tracing, while those on the right are produced by the suggested algorithms.
In Fig. 5, rays far from the gaze point terminate early with high probability, but the filtering algorithm removes the resulting noise. As a result, more noise is removed and the rendering quality is improved.

Fig. 5. The results obtained by path tracing (a) and the proposed algorithm (b). The rendering with the proposed algorithms produced less noise and more sampling.

Besides the image comparison, the mean squared error (MSE) and the structural similarity index (SSIM) were measured; these metrics are widely used to measure the performance of noise removal. The image obtained after running the path tracing 100 times was used as the ground truth. It was compared with images produced by running the path tracing once with and without the proposed algorithms. The results are summarized in Table 2.
As shown in Table 2, the proposed algorithms decrease noise by 7.05% and increase similarity with the ground-truth image by 39.06% on average. In all cases, the proposed algorithms improve quality through the additional samplings, the re-projection, and the noise removal.

Table 2. Noise removal/similarity with and without the algorithms
Scene Path tracing Proposed algorithms Percentage increase (%)
MSE SSIM MSE SSIM MSE SSIM
Teapot 2157.72 0.13 1994.47 0.18 7.57 38.46
Ant 2141.26 0.13 1971.1 0.18 7.95 38.46
Bunny 2139.19 0.13 1999.16 0.18 6.55 38.46
Monkey 2269.04 0.12 2124.44 0.17 6.37 41.67
Knot 2181.45 0.13 2031.54 0.18 6.87 38.46

Rendering performance
Rendering performance was also measured. Path tracing and the proposed algorithms were executed 100 times, the averages of which are listed in Table 3.
Table 3 shows that the algorithms improve the rendering performance in all cases, with the average improvement being 14.69%. The proposed algorithms improve performance through the early termination of rays based on the gaze position, even with noise removal.

Table 3. Performance results with and without the algorithms
Scene Path tracing (ms) Proposed algorithms (ms) Percentage increase (%)
Teapot 236 210.39 10.85
Ant 328.96 293.73 10.71
Bunny 330.55 274.12 17.07
Monkey 379.73 329.07 13.34
Knot 609.95 478.82 21.5

Conclusion

This paper presents novel and practical algorithms for global illumination on mobile devices. The algorithms exploit the following observations: VR needs to render 3D objects twice from slightly different camera positions; users' sensitivity to color and brightness decreases with distance from the gaze point; and recent mobile devices have a CPU with heterogeneous cores as well as a GPU. The proposed algorithms were verified on real mobile devices, and the results show that the CPU was utilized up to approximately 80%, rendering performance was improved by 14.69% with 39.06% greater similarity to the ground-truth image, and there was 7.05% less noise at the initial sampling.
However, the current research has the following limitations: it will be necessary to conduct further experiments with devices that have eye-tracking capability, as well as user studies with various filtering algorithms, to verify the algorithms' practicality. Furthermore, it will be necessary to verify the proposed algorithms on recent dedicated devices such as the Oculus Quest. Finally, more research will be required to improve the performance of path tracing on mobile devices. The authors plan to address these issues when mobile-based eye-tracking devices for virtual reality become available.

Acknowledgements

All of the implementations are available at https://github.com/seongkikim/VR_PathTracingWithGPU. To aid understanding of the results presented in this paper, the authors have created a video and uploaded it to https://youtu.be/ekytN5CvGAE.

Author’s Contributions

Conceptualization, SKK. Writing - original draft, review, and editing, SKK, MKS.

Funding

This research was supported by the National Research Foundation of Korea (No. NRF-2017R1A1A1A05069806).

Competing Interests

The authors declare that they have no competing interests.

Author Biography

Name : SeongKi Kim
Affiliation : Division of SW Convergence, Sangmyung University, Seoul, Korea
Biography : SeongKi Kim is an assistant professor at Sangmyung University, and his research areas include mixed reality.

Name : ManKyu Sung
Affiliation : Dept. of Game & Mobile, Keimyung University, Daegu, Korea
Biography : Mankyu Sung is an associate professor at Keimyung University whose research focuses on computer graphics.

References

[1] S. Park, S. Cho, J. Park, K. Huang, Y. Sung, and K. Cho, “Infrared bundle adjusting and clustering method for head-mounted display and Leap Motion calibration,” Human-centric Computing and Information Sciences, vol. 9, article no. 8, 2019. https://doi.org/10.1186/s13673-019-0169-6
[2] H. Wu, W. Luo, N. Pan, S. Nan, Y. Deng, S. Fu, and L. Yang, “Understanding freehand gestures: a study of freehand gestural interaction for immersive VR shopping applications,” Human-centric Computing and Information Sciences, vol. 9, article no. 43, 2019. https://doi.org/10.1186/s13673-019-0204-7
[3] F. Zhang, T. Y. Wu, J. S. Pan, G. Ding, and Z. Li, “Human motion recognition based on SVM in VR art media interaction environment,” Human-centric Computing and Information Sciences, vol. 9, article no. 40, 2019. https://doi.org/10.1186/s13673-019-0203-8
[4] S. Kim, J. Ryu, Y. Choi, Y. Kang, H. Li, and K. Kim, “Eye-contact game using mixed reality for the treatment of children with attention deficit hyperactivity disorder,” IEEE Access, vol. 8, pp. 45996-46006, 2020.
[5] F. Guan, A. Xu, and G. Jiang, “An improved fast camera calibration method for mobile terminals,” Journal of Information Processing Systems, vol. 15, no. 5, pp. 1082-1095, 2019.
[6] W. Jia, Q. Hua, M. Zhang, R. Chen, X. Ji, and B. Wang, “Mobile user interface pattern clustering using improved semi-supervised kernel fuzzy clustering method,” Journal of Information Processing Systems, vol. 15, no. 4, pp. 986-1016, 2019.
[7] PassMark Software, “Android Benchmarks,” 2020 [Online]. Available: https://www.androidbenchmark.net.
[8] J. T. Kajiya, “The rendering equation,” ACM SIGGRAPH Computer Graphics, vol. 20, no. 4, pp. 143-150, 1986.
[9] K. Rajan, S. Hashemi, U. Karpuzcu, M. Doggett, and S. Reda, “Dual-precision fixed-point arithmetic for low-power ray-triangle intersections,” Computers & Graphics, vol. 87, pp. 72-79, 2020.
[10] Y. Deng, Y. Ni, Z. Li, S. Mu, and W. Zhang, “Toward real-time ray tracing: a survey on hardware acceleration and microarchitecture techniques,” ACM Computing Surveys, vol. 50, no. 4, pp. 1-41, 2017.
[11] C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, and D. Manocha, “Fast BVH construction on GPUs,” Computer Graphics Forum, vol. 28, no. 2, pp. 375-384, 2009.
[12] I. Wald and V. Havran, “On building fast kd-trees for ray tracing, and on doing that in O(N log N),” in Proceedings of the 2006 IEEE Symposium on Interactive Ray Tracing, Salt Lake City, UT, 2006, pp. 61-69.
[13] I. Georgiev, T. Ize, M. Farnsworth, R. Montoya-Vozmediano, A. King, B. V. Lommel, et al., “Arnold: a brute-force production path tracer,” ACM Transactions on Graphics, vol. 37, no. 3, article no. 32, 2018. https://doi.org/10.1145/3182160
[14] RebusFarm, “Octane render farm,” 2020 [Online]. Available: https://kr.rebusfarm.net/en/3d-software/octane-render-farm.
[15] LuxCoreRender, “Major features,” 2020 [Online]. Available: https://luxcorerender.org/.
[17] ARM, “big.LITTLE technology: the future of mobile,” 2013 [Online]. Available: https://img.hexus.net/v2/press_releases/arm/big.LITTLE.Whitepaper.pdf.
[18] S. Kamdar and N. Kamdar, “big.LITTLE architecture: heterogeneous multicore processing,” International Journal of Computer Applications, vol. 119, no. 1, pp. 35-38, 2015.
[19] S. Wang, G. Ananthanarayanan, Y. Zeng, N. Goel, A. Pathania, and T. Mitra, “High-throughput CNN inference on embedded ARM Big.LITTLE multicore processors,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2254-2267, 2020.
[20] J. Park, Y. Kwon, Y. Park, and D. Jeon, “Microarchitecture-aware code generation for deep learning on single-ISA heterogeneous multi-core mobile processors,” IEEE Access, vol. 7, pp. 52371-52378, 2019.
[21] TensorFlow, “For mobile & IoT,” 2020 [Online]. Available: https://www.tensorflow.org/lite.
[22] M. Weier, M. Stengel, T. Roth, P. Didyk, E. Eisemann, M. Eisemann, et al., “Perception-driven accelerated rendering,” Computer Graphics Forum, vol. 36, no. 2, pp. 611-643, 2017.
[23] A. Patney, M. Salvi, J. Kim, A. Kaplanyan, C. Wyman, N. Benty, D. Luebke, and A. Lefohn, “Towards foveated rendering for gaze-tracked virtual reality,” ACM Transactions on Graphics, vol. 35, no. 6, article no. 179, 2016. https://doi.org/10.1145/2980179.2980246
[24] M. Koskela, T. Viitanen, P. Jaaskelainen, and J. Takala, “Foveated path tracing: a literature review and a performance gain analysis,” in Advances in Visual Computing. Cham, Switzerland: Springer, 2016, pp. 723-732. https://doi.org/10.1007/978-3-319-50835-1_65
[25] A. Siekawa, M. Chwesiuk, R. Mantiuk, and R. Piorkowski, “Foveated ray tracing for VR headsets,” in MultiMedia Modeling. Cham, Switzerland: Springer, 2019, pp. 106-117. https://doi.org/10.1007/978-3-030-05710-7_9
[26] W. Lee, S. J. Hwang, Y. Shin, J. Yoo, and S. Ryu, “Fast stereoscopic rendering on mobile ray tracing GPU for virtual reality applications,” in 2017 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, 2017, pp. 355-357.
[27] Khronos Group, “The OpenCL specification,” 2019 [Online]. Available: https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_API.html.
[28] D. Bucciarelli, “SmallptCPU vs SmallptGPU,” 2020 [Online]. Available: http://davibu.interfree.it/opencl/smallptgpu/smallptGPU.html.
