
Commit 4523c7b

blog edit (#1995)
Signed-off-by: Chris Abraham <[email protected]>
1 parent f6cf100 commit 4523c7b

File tree

1 file changed: +4 -4 lines changed


_posts/2025-04-23-pytorch-2-7.md (+4, -4)
@@ -41,13 +41,13 @@ This release is composed of 3262 commits from 457 contributors since PyTorch 2.6
 <tr>
 <td>
 </td>
-<td>FlexAttention LLM <span style="text-decoration:underline;">first token processing</span> on X86 CPUs
+<td>FlexAttention LLM <span style="text-decoration:underline;">first token processing</span> on x86 CPUs
 </td>
 </tr>
 <tr>
 <td>
 </td>
-<td>FlexAttention LLM <span style="text-decoration:underline;">throughput mode optimization</span> on X86 CPUs
+<td>FlexAttention LLM <span style="text-decoration:underline;">throughput mode optimization</span> on x86 CPUs
 </td>
 </tr>
 <tr>
@@ -135,9 +135,9 @@ For more information regarding Intel GPU support, please refer to [Getting Start
 See also the tutorials [here](https://pytorch.org/tutorials/prototype/inductor_windows.html) and [here](https://pytorch.org/tutorials/prototype/pt2e_quant_xpu_inductor.html).


-### [Prototype] FlexAttention LLM first token processing on X86 CPUs
+### [Prototype] FlexAttention LLM first token processing on x86 CPUs

-FlexAttention X86 CPU support was first introduced in PyTorch 2.6, offering optimized implementations — such as PageAttention, which is critical for LLM inference—via the TorchInductor C++ backend. In PyTorch 2.7, more attention variants for first token processing of LLMs are supported. With this feature, users can have a smoother experience running FlexAttention on x86 CPUs, replacing specific *scaled_dot_product_attention* operators with a unified FlexAttention API, and benefiting from general support and good performance when using torch.compile.
+FlexAttention x86 CPU support was first introduced in PyTorch 2.6, offering optimized implementations — such as PageAttention, which is critical for LLM inference—via the TorchInductor C++ backend. In PyTorch 2.7, more attention variants for first token processing of LLMs are supported. With this feature, users can have a smoother experience running FlexAttention on x86 CPUs, replacing specific *scaled_dot_product_attention* operators with a unified FlexAttention API, and benefiting from general support and good performance when using torch.compile.


 ### [Prototype] FlexAttention LLM throughput mode optimization

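The paragraph edited in the second hunk describes replacing *scaled_dot_product_attention* calls with the unified FlexAttention API and compiling it with torch.compile so that TorchInductor's C++ backend generates the CPU kernels. Below is a minimal sketch of that usage pattern; it is not part of the commit or the blog post. The tensor shapes and the causal score_mod are illustrative assumptions, and it assumes a PyTorch build that ships torch.nn.attention.flex_attention (2.6 or later, with the x86 CPU improvements landing in 2.7).

```python
# Minimal sketch (not from the commit): unified FlexAttention API under
# torch.compile, running on an x86 CPU. Shapes and the causal score_mod
# are illustrative assumptions.
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal(score, b, h, q_idx, kv_idx):
    # score_mod callback: keep the score where the query position may attend
    # to the key position, otherwise push it to -inf before the softmax.
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

B, H, S, D = 1, 8, 1024, 64  # batch, heads, prompt length, head dim (illustrative)
q = torch.randn(B, H, S, D)
k = torch.randn(B, H, S, D)
v = torch.randn(B, H, S, D)

# Compile once; on CPU tensors, TorchInductor's C++ backend produces the kernels.
compiled_flex_attention = torch.compile(flex_attention)
out = compiled_flex_attention(q, k, v, score_mod=causal)
print(out.shape)  # torch.Size([1, 8, 1024, 64])
```

The same compiled call can replace a torch.nn.functional.scaled_dot_product_attention invocation in a prefill (first token processing) path, with the attention variant expressed through score_mod or a block mask rather than a separate fused operator.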
0 commit comments
