examples/server/README.md: 8 additions & 8 deletions
@@ -303,23 +303,23 @@ mkdir llama-client
 cd llama-client
 ```
 
-Create a index.js file and put this inside:
+Create an index.js file and put this inside:
 
 ```javascript
-const prompt = `Building a website can be done in 10 simple steps:`;
+const prompt = "Building a website can be done in 10 simple steps:"
 
-async function Test() {
+async function test() {
     let response = await fetch("http://127.0.0.1:8080/completion", {
-        method: 'POST',
+        method: "POST",
         body: JSON.stringify({
             prompt,
-            n_predict: 512,
+            n_predict: 64,
         })
     })
     console.log((await response.json()).content)
 }
 
-Test()
+test()
 ```
 
 And run it:
@@ -381,7 +381,7 @@ Multiple prompts are also supported. In this case, the completion result will be
 `n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. The number excludes the BOS token.
 By default, this value is set to `0`, meaning no tokens are kept. Use `-1` to retain all tokens from the prompt.
 
-`stream`: It allows receiving each predicted token in real-time instead of waiting for the completion to finish. To enable this, set to `true`.
+`stream`: Allows receiving each predicted token in real-time instead of waiting for the completion to finish (uses a different response format). To enable this, set to `true`.
 
 `stop`: Specify a JSON array of stopping strings.
 These words will not be included in the completion, so make sure to add them to the prompt for the next iteration. Default: `[]`
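
Because `stream: true` switches the endpoint to a server-sent events response, a client has to read the body incrementally rather than call `response.json()`. The following is a minimal sketch of one way to do that, written in JavaScript to match the README example. `parseSseBuffer` and `streamCompletion` are hypothetical helper names, not part of the server. The sketch assumes Node 18+ (global `fetch`, async-iterable response body), the same local server address as the example in this diff, and that each event carries one JSON object with `content` and `stop` fields on its `data:` line.

```javascript
// Hypothetical helper (not part of the server): split buffered text into
// complete server-sent events and return whatever is left over.
function parseSseBuffer(buffer) {
    const events = []
    // Per the SSE format, events are separated by a blank line.
    const parts = buffer.split(/\r?\n\r?\n/)
    // The last part may be an incomplete event; keep it for the next chunk.
    const rest = parts.pop()
    for (const part of parts) {
        // Collect the payload of every "data:" line in this event.
        const data = part
            .split(/\r?\n/)
            .filter((line) => line.startsWith("data:"))
            .map((line) => line.slice(5).replace(/^ /, ""))
            .join("\n")
        // Assumption: the payload is a JSON object.
        if (data) events.push(JSON.parse(data))
    }
    return { events, rest }
}

// Sketch of a streaming request; assumes a server at 127.0.0.1:8080.
async function streamCompletion(prompt) {
    const response = await fetch("http://127.0.0.1:8080/completion", {
        method: "POST",
        body: JSON.stringify({ prompt, n_predict: 64, stream: true }),
    })
    const decoder = new TextDecoder()
    let buffer = ""
    let text = ""
    for await (const chunk of response.body) {
        buffer += decoder.decode(chunk, { stream: true })
        const { events, rest } = parseSseBuffer(buffer)
        buffer = rest
        for (const event of events) {
            text += event.content
            // Assumption: the final event sets `stop` to true.
            if (event.stop) return text
        }
    }
    return text
}
```

Calling `streamCompletion("Building a website can be done in 10 simple steps:")` would then resolve to the concatenated text once the server reports `stop`. Keeping the parser separate from the network code makes it possible to check the parsing logic without a running server.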
@@ -446,7 +446,7 @@ These words will not be included in the completion, so make sure to add them to
 
 **Response format**
 
-- Note: When using streaming mode (`stream`), only `content` and `stop` will be returned until end of completion.
+- Note: In streaming mode (`stream`), only `content` and `stop` will be returned until end of completion. Responses are sent using the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html) standard. Note: the browser's `EventSource` interface cannot be used due to its lack of `POST` request support.
 
 - `completion_probabilities`: An array of token probabilities for each completion. The array's length is `n_predict`. Each item in the array has the following structure: